{"ID":2885555,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.04086","arxiv_id":"2508.04086","title":"ToolGrad: Efficient Tool-use Dataset Generation with Textual \"Gradients\"","abstract":"Prior work synthesizes tool-use LLM datasets by first generating a user query, followed by complex tool-use annotations like depth-first search (DFS). This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual \"gradients\", and then synthesizes corresponding user queries. This \"answer-first\" approach led to ToolGrad-500, a dataset generated with more complex tool use, lower cost, and almost 100% pass rate. Experiments show that ToolGrad models outperform those trained on expensive baseline datasets and proprietary LLMs. The ToolGrad source code, dataset, and models are available at https://github.com/zhongyi-zhou/toolgrad.","short_abstract":"Prior work synthesizes tool-use LLM datasets by first generating a user query, followed by complex tool-use annotations like depth-first search (DFS). This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad firs...","url_abs":"https://arxiv.org/abs/2508.04086","url_pdf":"https://arxiv.org/pdf/2508.04086v2","authors":"[\"Zhongyi Zhou\",\"Kohei Uehara\",\"Haoyu Zhang\",\"Jingtao Zhou\",\"Lin Gu\",\"Ruofei Du\",\"Zheng Xu\",\"Tatsuya Harada\"]","published":"2025-08-06T05:04:00Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":611210,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2885555,"paper_url":"https://arxiv.org/abs/2508.04086","paper_title":"ToolGrad: Efficient Tool-use Dataset Generation with Textual \"Gradients\"","repo_url":"https://github.com/zhongyi-zhou/toolgrad","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
