{"ID":2834026,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.02772","arxiv_id":"2512.02772","title":"Towards Unification of Hallucination Detection and Fact Verification for Large Language Models","abstract":"Large Language Models (LLMs) frequently exhibit hallucinations, generating content that appears fluent and coherent but is factually incorrect. Such errors undermine trust and hinder their adoption in real-world applications. To address this challenge, two distinct research paradigms have emerged: model-centric Hallucination Detection (HD) and text-centric Fact Verification (FV). Despite sharing the same goal, these paradigms have evolved in isolation, using distinct assumptions, datasets, and evaluation protocols. This separation has created a research schism that hinders their collective progress. In this work, we take a decisive step toward bridging this divide. We introduce UniFact, a unified evaluation framework that enables direct, instance-level comparison between FV and HD by dynamically generating model outputs and corresponding factuality labels. Through large-scale experiments across multiple LLM families and detection methods, we reveal three key findings: (1) No paradigm is universally superior; (2) HD and FV capture complementary facets of factual errors; and (3) hybrid approaches that integrate both methods consistently achieve state-of-the-art performance. Beyond benchmarking, we provide the first in-depth analysis of why FV and HD diverged, as well as empirical evidence supporting the need for their unification. The comprehensive experimental results call for a new, integrated research agenda toward unifying Hallucination Detection and Fact Verification in LLMs. \nWe have open-sourced all the code, data, and baseline implementation at: https://github.com/oneal2000/UniFact/","short_abstract":"Large Language Models (LLMs) frequently exhibit hallucinations, generating content that appears fluent and coherent but is factually incorrect. Such errors undermine trust and hinder their adoption in real-world applications. To address this challenge, two distinct research paradigms have emerged: model-centric Halluci...","url_abs":"https://arxiv.org/abs/2512.02772","url_pdf":"https://arxiv.org/pdf/2512.02772v1","authors":"[\"Weihang Su\",\"Jianming Long\",\"Changyue Wang\",\"Shiyu Lin\",\"Jingyan Xu\",\"Ziyi Ye\",\"Qingyao Ai\",\"Yiqun Liu\"]","published":"2025-12-02T13:51:01Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.IR\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":606376,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2834026,"paper_url":"https://arxiv.org/abs/2512.02772","paper_title":"Towards Unification of Hallucination Detection and Fact Verification for Large Language Models","repo_url":"https://github.com/oneal2000/UniFact","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
