{"ID":2851565,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.19318","arxiv_id":"2510.19318","title":"HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy","abstract":"The increasing reliance on natural language generation (NLG) models, particularly large language models, has raised concerns about the reliability and accuracy of their outputs. A key challenge is hallucination, where models produce plausible but incorrect information. As a result, hallucination detection has become a critical task. In this work, we introduce a comprehensive hallucination taxonomy with 11 categories across various NLG tasks and propose the HAllucination Detection (HAD) models https://github.com/pku0xff/HAD, which integrate hallucination detection, span-level identification, and correction into a single inference process. Trained on an elaborate synthetic dataset of about 90K samples, our HAD models are versatile and can be applied to various NLG tasks. We also carefully annotate a test set for hallucination detection, called HADTest, which contains 2,248 samples. Evaluations on in-domain and out-of-domain test sets show that our HAD models generally outperform the existing baselines, achieving state-of-the-art results on HaluEval, FactCHD, and FaithBench, confirming their robustness and versatility.","short_abstract":"The increasing reliance on natural language generation (NLG) models, particularly large language models, has raised concerns about the reliability and accuracy of their outputs. A key challenge is hallucination, where models produce plausible but incorrect information. As a result, hallucination detection has become a...","url_abs":"https://arxiv.org/abs/2510.19318","url_pdf":"https://arxiv.org/pdf/2510.19318v1","authors":"[\"Fan Xu\",\"Xinyu Hu\",\"Zhenghan Yu\",\"Li Lin\",\"Xu Zhang\",\"Yang Zhang\",\"Wei Zhou\",\"Jinjie Gu\",\"Xiaojun Wan\"]","published":"2025-10-22T07:28:37Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":607917,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2851565,"paper_url":"https://arxiv.org/abs/2510.19318","paper_title":"HAD: HAllucination Detection Language Models Based on a Comprehensive Hallucination Taxonomy","repo_url":"https://github.com/pku0xff/HAD","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}