{"ID":2847713,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.27680","arxiv_id":"2510.27680","title":"PETAR: Localized Findings Generation with Mask-Aware Vision-Language Modeling for PET Automated Reporting","abstract":"Generating automated reports for 3D positron emission tomography (PET) is an important and challenging task in medical imaging. PET plays a vital role in oncology, but automating report generation is difficult due to the complexity of whole-body 3D volumes, the wide range of potential clinical findings, and the limited availability of annotated datasets. To address these challenges, we introduce PETARSeg-11K, the first large-scale, publicly available dataset that provides lesion-level correspondence between 3D PET/CT volumes and free-text radiological findings. It comprises 11,356 lesion descriptions paired with 3D segmentations. Second, we propose PETAR-4B, a 3D vision-language model designed for mask-aware, spatially grounded PET/CT reporting. PETAR-4B jointly encodes PET, CT, and 3D lesion segmentation masks, using a 3D focal prompt to capture fine-grained details of lesions that normally comprise less than 0.1% of the volume. Evaluations using automated metrics show PETAR-4B substantially outperforming all 2D and 3D baselines. A human study involving five physicians -- the first of its kind for automated PET reporting -- confirms the model's clinical utility and establishes correlations between automated metrics and expert judgment. This work provides a foundational dataset and a novel architecture, advancing 3D medical vision-language understanding in PET.","short_abstract":"Generating automated reports for 3D positron emission tomography (PET) is an important and challenging task in medical imaging. PET plays a vital role in oncology, but automating report generation is difficult due to the complexity of whole-body 3D volumes, the wide range of potential clinical findings, and the limited...","url_abs":"https://arxiv.org/abs/2510.27680","url_pdf":"https://arxiv.org/pdf/2510.27680v2","authors":"[\"Danyal Maqbool\",\"Changhee Lee\",\"Zachary Huemann\",\"Samuel D. Church\",\"Matthew E. Larson\",\"Scott B. Perlman\",\"Tomas A. Romero\",\"Joshua D. Warner\",\"Meghan Lubner\",\"Xin Tie\",\"Jameson Merkow\",\"Junjie Hu\",\"Steve Y. Cho\",\"Tyler J. Bradshaw\"]","published":"2025-10-31T17:49:01Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Language Model\"]","has_code":false}
