{"ID":2888213,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.23242","arxiv_id":"2507.23242","title":"Annotation-Free Reinforcement Learning Query Rewriting via Verifiable Search Reward","abstract":"Optimizing queries for Retrieval-Augmented Generation (RAG) systems poses a significant challenge, particularly across diverse modal indices. We introduce RL-QR, a novel annotation-free reinforcement learning framework for query rewriting that eliminates the need for costly human-annotated data. By leveraging verifiable search rewards derived from index-aligned synthetic queries, RL-QR overcomes human-annotation dependencies, extending its applicability to various modalities and index domains. Experimental results demonstrate the framework's robustness, achieving substantial retrieval performance gains of up to 3.9$\\times$ on lexical retrievers and 3.5$\\times$ on semantic retrievers on the MTEB VIDORE V2 benchmark for unstructured visual documents, along with consistent 5\\% to 10\\% improvements on MS MARCO v2.1 and internal industrial datasets.","short_abstract":"Optimizing queries for Retrieval-Augmented Generation (RAG) systems poses a significant challenge, particularly across diverse modal indices. We introduce RL-QR, a novel annotation-free reinforcement learning framework for query rewriting that eliminates the need for costly human-annotated data. By leveraging verifiabl...","url_abs":"https://arxiv.org/abs/2507.23242","url_pdf":"https://arxiv.org/pdf/2507.23242v2","authors":"[\"Sungguk Cha\",\"DongWook Kim\",\"Taeseung Hahn\",\"Mintae Kim\",\"Youngsub Han\",\"Byoung-Ki Jeon\"]","published":"2025-07-31T04:55:21Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.CL\",\"cs.LG\"]","methods":"[\"RAG\",\"Reinforcement Learning\"]","has_code":false}