{"ID":2878406,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.17694","arxiv_id":"2508.17694","title":"Semantic Search for Information Retrieval","abstract":"Information retrieval systems have progressed notably from lexical techniques such as BM25 and TF-IDF to modern semantic retrievers. This survey provides a brief overview of the BM25 baseline, then discusses the architecture of modern state-of-the-art semantic retrievers. Advancing from BERT, we introduce dense bi-encoders (DPR), late-interaction models (ColBERT), and neural sparse retrieval (SPLADE). Finally, we examine MonoT5, a cross-encoder model. We conclude with common evaluation tactics, pressing challenges, and propositions for future directions.","short_abstract":"Information retrieval systems have progressed notably from lexical techniques such as BM25 and TF-IDF to modern semantic retrievers. This survey provides a brief overview of the BM25 baseline, then discusses the architecture of modern state-of-the-art semantic retrievers. Advancing from BERT, we introduce dense bi-enco...","url_abs":"https://arxiv.org/abs/2508.17694","url_pdf":"https://arxiv.org/pdf/2508.17694v1","authors":"[\"Kayla Farivar\"]","published":"2025-08-25T06:03:26Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[]","has_code":false}
