{"ID":2850758,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.21618","arxiv_id":"2510.21618","title":"DeepAgent: A General Reasoning Agent with Scalable Toolsets","abstract":"Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically follow predefined workflows, which limit autonomous and global task completion. In this paper, we introduce DeepAgent, an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution within a single, coherent reasoning process. To manage long-horizon interactions, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories, reducing error accumulation while preserving critical information. To teach general-purpose tool use efficiently and stably, we develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call advantage attribution to assign fine-grained credit to the tool invocation tokens. Extensive experiments on eight benchmarks, including general tool-use tasks (ToolBench, API-Bank, TMDB, Spotify, ToolHop) and downstream applications (ALFWorld, WebShop, GAIA, HLE), demonstrate that DeepAgent consistently outperforms baselines across both labeled-tool and open-set tool retrieval scenarios. The code and demo are available at https://github.com/RUC-NLPIR/DeepAgent.","short_abstract":"Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically follow predefined workflows, which limit autonomous and global task completion. In this paper, we introduce DeepAgent, an end-to...","url_abs":"https://arxiv.org/abs/2510.21618","url_pdf":"https://arxiv.org/pdf/2510.21618v3","authors":"[\"Xiaoxi Li\",\"Wenxiang Jiao\",\"Jiarui Jin\",\"Guanting Dong\",\"Jiajie Jin\",\"Yinuo Wang\",\"Hao Wang\",\"Yutao Zhu\",\"Ji-Rong Wen\",\"Yuan Lu\",\"Zhicheng Dou\"]","published":"2025-10-24T16:24:01Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CL\",\"cs.IR\",\"cs.LG\"]","methods":"[\"Reinforcement Learning\",\"Large Language Model\"]","has_code":false,"code_links":[{"ID":607838,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2850758,"paper_url":"https://arxiv.org/abs/2510.21618","paper_title":"DeepAgent: A General Reasoning Agent with Scalable Toolsets","repo_url":"https://github.com/RUC-NLPIR/DeepAgent","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
