{"ID":2881502,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.12455","arxiv_id":"2508.12455","title":"X-Ray-CoT: Interpretable Chest X-ray Diagnosis with Vision-Language Models via Chain-of-Thought Reasoning","abstract":"Chest X-ray imaging is crucial for diagnosing pulmonary and cardiac diseases, yet its interpretation demands extensive clinical experience and suffers from inter-observer variability. While deep learning models offer high diagnostic accuracy, their black-box nature hinders clinical adoption in high-stakes medical settings. To address this, we propose X-Ray-CoT (Chest X-Ray Chain-of-Thought), a novel framework leveraging Vision-Language Large Models (LVLMs) for intelligent chest X-ray diagnosis and interpretable report generation. X-Ray-CoT simulates human radiologists' \"chain-of-thought\" by first extracting multi-modal features and visual concepts, then employing an LLM-based component with a structured Chain-of-Thought prompting strategy to reason and produce detailed natural language diagnostic reports. Evaluated on the CORDA dataset, X-Ray-CoT achieves competitive quantitative performance, with a Balanced Accuracy of 80.52% and F1 score of 78.65% for disease diagnosis, slightly surpassing existing black-box models. Crucially, it uniquely generates high-quality, explainable reports, as validated by preliminary human evaluations. Our ablation studies confirm the integral role of each proposed component, highlighting the necessity of multi-modal fusion and CoT reasoning for robust and transparent medical AI. This work represents a significant step towards trustworthy and clinically actionable AI systems in medical imaging.","short_abstract":"Chest X-ray imaging is crucial for diagnosing pulmonary and cardiac diseases, yet its interpretation demands extensive clinical experience and suffers from inter-observer variability. While deep learning models offer high diagnostic accuracy, their black-box nature hinders clinical adoption in high-stakes medical setti...","url_abs":"https://arxiv.org/abs/2508.12455","url_pdf":"https://arxiv.org/pdf/2508.12455v1","authors":"[\"Chee Ng\",\"Liliang Sun\",\"Shaoqing Tang\"]","published":"2025-08-17T18:00:41Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
