{"ID":2887828,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.00378","arxiv_id":"2508.00378","title":"CoRGI: Verified Chain-of-Thought Reasoning with Post-hoc Visual Grounding","abstract":"Multimodal reasoning with vision-language models (VLMs) often suffers from hallucinations, as models tend to generate explanations after only a superficial inspection of the image. We present CoRGI (Chain of Reasoning with Grounded Insights), a framework that enhances reasoning reliability through post-hoc verification of chain-of-thought outputs. Given a VLM-generated rationale, CoRGI decomposes it into step-wise statements, grounds each step in visual evidence, and filters or corrects unsupported claims before producing the final answer. Experiments on five challenging benchmarks (VCR, ScienceQA, MMMU, MathVista, and HallusionBench) demonstrate that CoRGI consistently improves both answer accuracy and explanation faithfulness across multiple VLM backbones, including Qwen-2.5VL, LLaVA-1.6, and Gemma3-12B. Beyond quantitative gains, qualitative analyses further illustrate how the verification process reduces hallucination and strengthens interpretability, suggesting that post-hoc visual grounding is a promising direction for building more trustworthy and transparent multimodal reasoning systems.","short_abstract":"Multimodal reasoning with vision-language models (VLMs) often suffers from hallucinations, as models tend to generate explanations after only a superficial inspection of the image. We present CoRGI (Chain of Reasoning with Grounded Insights), a framework that enhance...","url_abs":"https://arxiv.org/abs/2508.00378","url_pdf":"https://arxiv.org/pdf/2508.00378v3","authors":"[\"Shixin Yi\",\"Lin Shang\"]","published":"2025-08-01T07:17:12Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false}
