{"ID":2923668,"CreatedAt":"2026-06-02T04:05:25.881865328Z","UpdatedAt":"2026-06-04T13:12:39.622923895Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.02245","arxiv_id":"2606.02245","title":"When Knowledge Is Not Free: Cost-Aware Evidence Selection in Retrieval-Augmented Generation","abstract":"Retrieval-Augmented Generation (RAG) typically assumes that external knowledge is free, but many high-quality sources are paywalled, licensed, restricted, or otherwise costly to access. We introduce cost-aware RAG, a setting where retrieved evidence is assigned access-cost tiers and systems must answer under an explicit evidence-access budget. We instantiate this setting by augmenting MS MARCO v2.1 with access-friction tiers and evaluate budgeted evidence selection across general-domain and domain-specific QA benchmarks. Our results show that static selection is brittle: no fixed selector uniformly dominates, and larger budgets do not reliably improve answer quality, even when costly evidence is domain-matched. We then study agentic cost-aware RAG, where an LLM decides when to retrieve, which tier to access, and when to stop. Agents show strong promise as adaptive evidence-acquisition controllers, but their behavior remains highly model- and task-dependent. These findings suggest that cost-aware evidence acquisition is a central challenge for the next generation of RAG systems. All code and data are available at https://github.com/Mignonmy/Cost-Aware.","short_abstract":"Retrieval-Augmented Generation (RAG) typically assumes that external knowledge is free, but many high-quality sources are paywalled, licensed, restricted, or otherwise costly to access. We introduce cost-aware RAG, a setting where retrieved evidence is assigned access-cost tiers and systems must answer under an explici...","url_abs":"https://arxiv.org/abs/2606.02245","url_pdf":"https://arxiv.org/pdf/2606.02245v1","authors":"[\"Mingyan Wu\",\"Han Yang\",\"Omer Ben-Porat\",\"Yftah Ziser\"]","published":"2026-06-01T13:39:39Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"RAG\",\"Large Language Model\"]","has_code":false,"code_links":[{"ID":612677,"CreatedAt":"2026-06-02T04:05:25.881865328Z","UpdatedAt":"2026-06-02T04:05:25.881865328Z","DeletedAt":null,"paper_id":2923668,"paper_url":"https://arxiv.org/abs/2606.02245","paper_title":"When Knowledge Is Not Free: Cost-Aware Evidence Selection in Retrieval-Augmented Generation","repo_url":"https://github.com/Mignonmy/Cost-Aware","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
