{"ID":2883745,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.07975","arxiv_id":"2508.07975","title":"Improving Document Retrieval Coherence for Semantically Equivalent Queries","abstract":"Dense Retrieval (DR) models have proven to be effective for Document Retrieval and Information Grounding tasks. Usually, these models are trained and optimized for improving the relevance of top-ranked documents for a given query. Previous work has shown that popular DR models are sensitive to the query and document lexicon: small variations of it may lead to a significant difference in the set of retrieved documents. In this paper, we propose a variation of the Multi-Negative Ranking loss for training DR that improves the coherence of models in retrieving the same documents with respect to semantically similar queries. The loss penalizes discrepancies between the top-k ranked documents retrieved for diverse but semantically equivalent queries. We conducted extensive experiments on various datasets: MS-MARCO, Natural Questions, BEIR, and TREC DL 19/20. The results show that (i) models optimized by our loss are subject to lower sensitivity and (ii), interestingly, higher accuracy.","short_abstract":"Dense Retrieval (DR) models have proven to be effective for Document Retrieval and Information Grounding tasks. Usually, these models are trained and optimized for improving the relevance of top-ranked documents for a given query. Previous work has shown that popular DR models are sensitive to the query and document le...","url_abs":"https://arxiv.org/abs/2508.07975","url_pdf":"https://arxiv.org/pdf/2508.07975v1","authors":"[\"Stefano Campese\",\"Alessandro Moschitti\",\"Ivano Lauriola\"]","published":"2025-08-11T13:34:59Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.CL\"]","methods":"[]","has_code":false}
