{"ID":2861692,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.02512","arxiv_id":"2510.02512","title":"Revisiting Query Variants: The Advantage of Retrieval Over Generation of Query Variants for Effective QPP","abstract":"Leveraging query variants (QVs), i.e., queries with potentially similar information needs to the target query, has been shown to improve the effectiveness of query performance prediction (QPP) approaches. Existing QV-based QPP methods generate QVs facilitated by either query expansion or non-contextual embeddings, which may introduce topical drifts and hallucinations. In this paper, we propose a method that retrieves QVs from a training set (e.g., MS MARCO) for a given target query of QPP. To achieve a high recall in retrieving queries with the most similar information needs to the target query from a training set, we extend the directly retrieved QVs (1-hop QVs) by a second retrieval using their denoted relevant documents (which yields 2-hop QVs). Our experiments, conducted on TREC DL'19 and DL'20, show that the QPP methods with QVs retrieved by our method outperform the best-performing existing generated-QV-based QPP approaches by as much as around 20% on neural ranking models like MonoT5.","short_abstract":"Leveraging query variants (QVs), i.e., queries with potentially similar information needs to the target query, has been shown to improve the effectiveness of query performance prediction (QPP) approaches. Existing QV-based QPP methods generate QVs facilitated by either query expansion or non-contextual embeddings, whic...","url_abs":"https://arxiv.org/abs/2510.02512","url_pdf":"https://arxiv.org/pdf/2510.02512v1","authors":"[\"Fangzheng Tian\",\"Debasis Ganguly\",\"Craig Macdonald\"]","published":"2025-10-02T19:36:58Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[]","has_code":false}
