{"ID":2856363,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.11409","arxiv_id":"2510.11409","title":"Leveraging LLMs for Semi-Automatic Corpus Filtration in Systematic Literature Reviews","abstract":"The creation of systematic literature reviews (SLR) is critical for analyzing the landscape of a research field and guiding future research directions. However, retrieving and filtering the literature corpus for an SLR is highly time-consuming and requires extensive manual effort, as keyword-based searches in digital libraries often return numerous irrelevant publications. In this work, we propose a pipeline leveraging multiple large language models (LLMs), classifying papers based on descriptive prompts and deciding jointly using a consensus scheme. The entire process is human-supervised and interactively controlled via our open-source visual analytics web interface, LLMSurver, which enables real-time inspection and modification of model outputs. We evaluate our approach using ground-truth data from a recent SLR comprising over 8,000 candidate papers, benchmarking both open and commercial state-of-the-art LLMs from mid-2024 and fall 2025. Results demonstrate that our pipeline significantly reduces manual effort while achieving lower error rates than single human annotators. Furthermore, modern open-source models prove sufficient for this task, making the method accessible and cost-effective. Overall, our work demonstrates how responsible human-AI collaboration can accelerate and enhance systematic literature reviews within academic workflows.","short_abstract":"The creation of systematic literature reviews (SLR) is critical for analyzing the landscape of a research field and guiding future research directions. However, retrieving and filtering the literature corpus for an SLR is highly time-consuming and requires extensive manual effort, as keyword-based searches in digital l...","url_abs":"https://arxiv.org/abs/2510.11409","url_pdf":"https://arxiv.org/pdf/2510.11409v1","authors":"[\"Lucas Joos\",\"Daniel A. Keim\",\"Maximilian T. Fischer\"]","published":"2025-10-13T13:48:29Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.DL\",\"cs.HC\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
