{"ID":2867073,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.18869","arxiv_id":"2509.18869","title":"On The Reproducibility Limitations of RAG Systems","abstract":"Retrieval-Augmented Generation (RAG) is increasingly employed in generative AI-driven scientific workflows to integrate rapidly evolving scientific knowledge bases, yet its reliability is frequently compromised by non-determinism in their retrieval components. This paper introduces ReproRAG, a comprehensive benchmarking framework designed to systematically measure and quantify the reproducibility of vector-based retrieval systems. ReproRAG investigates sources of uncertainty across the entire pipeline, including different embedding models, precision, retrieval algorithms, hardware configurations, and distributed execution environments. Utilizing a suite of metrics, such as Exact Match Rate, Jaccard Similarity, and Kendall's Tau, the proposed framework effectively characterizes the trade-offs between reproducibility and performance. Our large-scale empirical study reveals critical insights; for instance, we observe that different embedding models have remarkable impact on RAG reproducibility. The open-sourced ReproRAG framework provides researchers and engineers productive tools to validate deployments, benchmark reproducibility, and make informed design decisions, thereby fostering more trustworthy AI for science.","short_abstract":"Retrieval-Augmented Generation (RAG) is increasingly employed in generative AI-driven scientific workflows to integrate rapidly evolving scientific knowledge bases, yet its reliability is frequently compromised by non-determinism in their retrieval components. This paper introduces ReproRAG, a comprehensive benchmarkin...","url_abs":"https://arxiv.org/abs/2509.18869","url_pdf":"https://arxiv.org/pdf/2509.18869v1","authors":"[\"Baiqiang Wang\",\"Dongfang Zhao\",\"Nathan R Tallent\",\"Luanzheng Guo\"]","published":"2025-09-23T10:07:29Z","proceeding":"cs.DC","tasks":"[\"cs.DC\"]","methods":"[\"RAG\"]","has_code":false}
