{"ID":2859218,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.08603","arxiv_id":"2510.08603","title":"YpathRAG:A Retrieval-Augmented Generation Framework and Benchmark for Pathology","abstract":"Large language models (LLMs) excel on general tasks yet still hallucinate in high-barrier domains such as pathology. Prior work often relies on domain fine-tuning, which neither expands the knowledge boundary nor enforces evidence-grounded constraints. We therefore build a pathology vector database covering 28 subfields and 1.53 million paragraphs, and present YpathRAG, a pathology-oriented RAG framework with dual-channel hybrid retrieval (BGE-M3 dense retrieval coupled with vocabulary-guided sparse retrieval) and an LLM-based supportive-evidence judgment module that closes the retrieval-judgment-generation loop. We also release two evaluation benchmarks, YpathR and YpathQA-M. On YpathR, YpathRAG attains Recall@5 of 98.64%, a gain of 23 percentage points over the baseline; on YpathQA-M, a set of the 300 most challenging questions, it increases the accuracies of both general and medical LLMs by 9.0% on average and up to 15.6%. These results demonstrate improved retrieval quality and factual reliability, providing a scalable construction paradigm and interpretable evaluation for pathology-oriented RAG.","short_abstract":"Large language models (LLMs) excel on general tasks yet still hallucinate in high-barrier domains such as pathology. Prior work often relies on domain fine-tuning, which neither expands the knowledge boundary nor enforces evidence-grounded constraints. We therefore build a pathology vector database covering 28 subfield...","url_abs":"https://arxiv.org/abs/2510.08603","url_pdf":"https://arxiv.org/pdf/2510.08603v1","authors":"[\"Deshui Yu\",\"Yizhi Wang\",\"Saihui Jin\",\"Taojie Zhu\",\"Fanyi Zeng\",\"Wen Qian\",\"Zirui Huang\",\"Jingli Ouyang\",\"Jiameng Li\",\"Zhen Song\",\"Tian Guan\",\"Yonghong He\"]","published":"2025-10-07T08:47:59Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}