{"ID":2843768,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.06752","arxiv_id":"2511.06752","title":"Med-SORA: Symptom to Organ Reasoning in Abdomen CT Images","abstract":"Understanding symptom-image associations is crucial for clinical reasoning. However, existing medical multimodal models often rely on simple one-to-one hard labeling, oversimplifying clinical reality where symptoms relate to multiple organs. In addition, they mainly use single-slice 2D features without incorporating 3D information, limiting their ability to capture full anatomical context. In this study, we propose Med-SORA, a framework for symptom-to-organ reasoning in abdominal CT images. Med-SORA introduces RAG-based dataset construction, soft labeling with learnable organ anchors to capture one-to-many symptom-organ relationships, and a 2D-3D cross-attention architecture to fuse local and global image features. To our knowledge, this is the first work to address symptom-to-organ reasoning in medical multimodal learning. Experimental results show that Med-SORA outperforms existing medical multimodal models and enables accurate 3D clinical reasoning.","short_abstract":"Understanding symptom-image associations is crucial for clinical reasoning. However, existing medical multimodal models often rely on simple one-to-one hard labeling, oversimplifying clinical reality where symptoms relate to multiple organs. In addition, they mainly use single-slice 2D features without incorporating 3D...","url_abs":"https://arxiv.org/abs/2511.06752","url_pdf":"https://arxiv.org/pdf/2511.06752v1","authors":"[\"You-Kyoung Na\",\"Yeong-Jun Cho\"]","published":"2025-11-10T06:30:51Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}