{"ID":2823611,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.24615","arxiv_id":"2512.24615","title":"Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization","abstract":"Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without expensive fine-tuning. To address these issues, we propose Youtu-Agent, a modular framework designed for the automated generation and continuous evolution of LLM agents. Youtu-Agent features a structured configuration system that decouples execution environments, toolkits, and context management, enabling flexible reuse and automated synthesis. We introduce two generation paradigms: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, capable of automatically generating tool code, prompts, and configurations. Furthermore, Youtu-Agent establishes a hybrid policy optimization system: (1) an Agent Practice module that enables agents to accumulate experience and improve performance through in-context optimization without parameter updates; and (2) an Agent RL module that integrates with distributed training frameworks to enable scalable and stable reinforcement learning of any Youtu-Agents in an end-to-end, large-scale manner. Experiments demonstrate that Youtu-Agent achieves state-of-the-art performance on WebWalkerQA (71.47%) and GAIA (72.8%) using open-weight models. Our automated generation pipeline achieves over 81% tool synthesis success rate, while the Practice module improves performance on AIME 2024/2025 by +2.7% and +5.4% respectively. Moreover, our Agent RL training achieves 40% speedup with steady performance improvement on 7B LLMs, enhancing coding/reasoning and searching capabilities respectively up to 35% and 21% on Maths and general/multi-hop QA benchmarks.","short_abstract":"Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without...","url_abs":"https://arxiv.org/abs/2512.24615","url_pdf":"https://arxiv.org/pdf/2512.24615v1","authors":"[\"Yuchen Shi\",\"Yuzheng Cai\",\"Siqi Cai\",\"Zihan Xu\",\"Lichao Chen\",\"Yulei Qin\",\"Zhijian Zhou\",\"Xiang Fei\",\"Chaofan Qiu\",\"Xiaoyu Tan\",\"Gang Li\",\"Zongyi Li\",\"Haojia Lin\",\"Guocan Cai\",\"Yong Mao\",\"Yunsheng Wu\",\"Ke Li\",\"Xing Sun\"]","published":"2025-12-31T04:17:36Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Reinforcement Learning\",\"Large Language Model\",\"Language Model\"]","has_code":false}
