{"ID":3052357,"CreatedAt":"2026-06-04T04:41:36.695875263Z","UpdatedAt":"2026-06-06T06:50:57.632975493Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.04536","arxiv_id":"2606.04536","title":"Scaling Self-Evolving Agents via Parametric Memory","abstract":"Existing memory-augmented LLM agents store past experience exclusively in prompt space, as textual summaries or retrieved passages, while keeping model parameters frozen throughout a rollout. Such agents can \\emph{look up} what they have seen but cannot \\emph{learn from} it: their policy is unchanged by experience, and any information dropped from the context is permanently lost. We introduce \\texttt{TMEM}, a self-evolving parametric memory framework in which the agent not only compresses history into explicit memory but also absorbs distilled supervision into fast LoRA weights $Δ_t$ via lightweight online updates, genuinely altering its future behavior within a single episode. We formalize this as an agentic decision process with fast-weight rollout dynamics: actions are sampled from $π_{θ_0+Δ_t}$, while extraction actions produce supervision that updates $Δ_t$ for subsequent decisions. This view makes the extraction policy directly optimizable by RL: training $θ_0$ improves not only task actions but also the quality of the data used for online LoRA adaptation. We further propose SVD-based initialization of the LoRA subspace to accelerate online convergence. Experiments on LoCoMo, LongMemEval-S, multi-objective search, and CL-Bench show that \\texttt{TMEM} consistently outperforms summary-based and retrieval-based baselines across different model scales.","short_abstract":"Existing memory-augmented LLM agents store past experience exclusively in prompt space, as textual summaries or retrieved passages, while keeping model parameters frozen throughout a rollout. Such agents can \\emph{look up} what they have seen but cannot \\emph{learn from} it: their policy is unchanged by experience, and...","url_abs":"https://arxiv.org/abs/2606.04536","url_pdf":"https://arxiv.org/pdf/2606.04536v1","authors":"[\"Tao Ren\",\"Weiyao Luo\",\"Hui Yang\",\"Rongzhi Zhu\",\"Xiang Huang\",\"Yuchuan Wu\",\"Bingxue Chou\",\"Jieping Ye\",\"Jiafeng Liang\",\"Yongbin Li\",\"Yijie Peng\"]","published":"2026-06-03T07:18:31Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"LoRA\"]","has_code":false}
