{"ID":2887108,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.02841","arxiv_id":"2508.02841","title":"A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering","abstract":"Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges in factual accuracy, hallucinations, and cross-modal misalignment. We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. We evaluate our system on a challenging RVQA set curated via model disagreement filtering, comprising consistently hard cases across multiple MLLMs. Extensive experiments demonstrate the superiority and effectiveness of our system over strong MLLM baselines, with a case study illustrating its reliability and interpretability. This work highlights the potential of multi-agent approaches to support explainable and trustworthy clinical AI applications that require complex reasoning.","short_abstract":"Radiology visual question answering (RVQA) provides precise answers to questions about chest X-ray images, alleviating radiologists' workload. While recent methods based on multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have shown promising progress in RVQA, they still face challenges...","url_abs":"https://arxiv.org/abs/2508.02841","url_pdf":"https://arxiv.org/pdf/2508.02841v1","authors":"[\"Ziruo Yi\",\"Jinyu Liu\",\"Ting Xiao\",\"Mark V. Albert\"]","published":"2025-08-04T19:09:52Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.IR\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}
