{"ID":2845000,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.05385","arxiv_id":"2511.05385","title":"TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework","abstract":"Retrieval-Augmented Generation (RAG) utilizes external knowledge to augment Large Language Models' (LLMs) reliability. For flexibility, agentic RAG employs autonomous, multi-round retrieval and reasoning to resolve queries. Although recent agentic RAG has improved via reinforcement learning, they often incur substantial token overhead from search and reasoning processes. This trade-off prioritizes accuracy over efficiency. To address this issue, this work proposes TeaRAG, a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps. 1) First, the retrieved content is compressed by augmenting chunk-based semantic retrieval with a graph retrieval using concise triplets. A knowledge association graph is then built from semantic similarity and co-occurrence. Finally, Personalized PageRank is leveraged to highlight key knowledge within this graph, reducing the number of tokens per retrieval. 2) Besides, to reduce reasoning steps, Iterative Process-aware Direct Preference Optimization (IP-DPO) is proposed. Specifically, our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps. This design can produce high-quality preference-pair datasets, supporting iterative DPO to improve reasoning conciseness. Across six datasets, TeaRAG improves the average Exact Match by 4% and 2% while reducing output tokens by 61% and 59% on Llama3-8B-Instruct and Qwen2.5-14B-Instruct, respectively. \nCode is available at https://github.com/Applied-Machine-Learning-Lab/TeaRAG.","short_abstract":"Retrieval-Augmented Generation (RAG) utilizes external knowledge to augment Large Language Models' (LLMs) reliability. For flexibility, agentic RAG employs autonomous, multi-round retrieval and reasoning to resolve queries. Although recent agentic RAG has improved via reinforcement learning, they often incur substantia...","url_abs":"https://arxiv.org/abs/2511.05385","url_pdf":"https://arxiv.org/pdf/2511.05385v1","authors":"[\"Chao Zhang\",\"Yuhao Wang\",\"Derong Xu\",\"Haoxin Zhang\",\"Yuanjie Lyu\",\"Yuhao Chen\",\"Shuochen Liu\",\"Tong Xu\",\"Xiangyu Zhao\",\"Yan Gao\",\"Yao Hu\",\"Enhong Chen\"]","published":"2025-11-07T16:08:34Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.AI\"]","methods":"[\"RAG\",\"Reinforcement Learning\",\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":607340,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2845000,"paper_url":"https://arxiv.org/abs/2511.05385","paper_title":"TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework","repo_url":"https://github.com/Applied-Machine-Learning-Lab/TeaRAG","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}