{"ID":2838870,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.15996","arxiv_id":"2511.15996","title":"QueryGym: A Toolkit for Reproducible LLM-Based Query Reformulation","abstract":"We present QueryGym, a lightweight, extensible Python toolkit that supports large language model (LLM)-based query reformulation. This is an important tool development since recent work on llm-based query reformulation has shown notable increase in retrieval effectiveness. However, while different authors have sporadically shared the implementation of their methods, there is no unified toolkit that provides a consistent implementation of such methods, which hinders fair comparison, rapid experimentation, consistent benchmarking and reliable deployment. QueryGym addresses this gap by providing a unified framework for implementing, executing, and comparing llm-based reformulation methods. The toolkit offers: (1) a Python API for applying diverse LLM-based methods, (2) a retrieval-agnostic interface supporting integration with backends such as Pyserini and PyTerrier, (3) a centralized prompt management system with versioning and metadata tracking, (4) built-in support for benchmarks like BEIR and MS MARCO, and (5) a completely open-source extensible implementation available to all researchers. QueryGym is publicly available at https://github.com/radinhamidi/QueryGym.","short_abstract":"We present QueryGym, a lightweight, extensible Python toolkit that supports large language model (LLM)-based query reformulation. This is an important tool development since recent work on llm-based query reformulation has shown notable increase in retrieval effectiveness. However, while different authors have sporadic...","url_abs":"https://arxiv.org/abs/2511.15996","url_pdf":"https://arxiv.org/pdf/2511.15996v1","authors":"[\"Amin Bigdeli\",\"Radin Hamidi Rad\",\"Mert Incesu\",\"Negar Arabzadeh\",\"Charles L. A. Clarke\",\"Ebrahim Bagheri\"]","published":"2025-11-20T02:45:50Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":606813,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2838870,"paper_url":"https://arxiv.org/abs/2511.15996","paper_title":"QueryGym: A Toolkit for Reproducible LLM-Based Query Reformulation","repo_url":"https://github.com/radinhamidi/QueryGym","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
