{"ID":2876899,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.02602","arxiv_id":"2509.02602","title":"Masked Autoencoder Pretraining and BiXLSTM ResNet Architecture for PET/CT Tumor Segmentation","abstract":"The accurate segmentation of lesions in whole-body PET/CT imaging is es-sential for tumor characterization, treatment planning, and response assess-ment, yet current manual workflows are labor-intensive and prone to inter-observer variability. Automated deep learning methods have shown promise but often remain limited by modality specificity, isolated time points, or in-sufficient integration of expert knowledge. To address these challenges, we present a two-stage lesion segmentation framework developed for the fourth AutoPET Challenge. In the first stage, a Masked Autoencoder (MAE) is em-ployed for self-supervised pretraining on unlabeled PET/CT and longitudinal CT scans, enabling the extraction of robust modality-specific representations without manual annotations. In the second stage, the pretrained encoder is fine-tuned with a bidirectional XLSTM architecture augmented with ResNet blocks and a convolutional decoder. By jointly leveraging anatomical (CT) and functional (PET) information as complementary input channels, the model achieves improved temporal and spatial feature integration. Evalua-tion on the AutoPET Task 1 dataset demonstrates that self-supervised pre-training significantly enhances segmentation accuracy, achieving a Dice score of 0.582 compared to 0.543 without pretraining. These findings high-light the potential of combining self-supervised learning with multimodal fu-sion for robust and generalizable PET/CT lesion segmentation. Code will be available at https://github.com/RespectKnowledge/AutoPet_2025_BxLSTM_UNET_Segmentation","short_abstract":"The accurate segmentation of lesions in whole-body PET/CT imaging is es-sential for tumor characterization, treatment planning, and response assess-ment, yet current manual workflows are labor-intensive and prone to inter-observer variability. Automated deep learning methods have shown promise but often remain limited...","url_abs":"https://arxiv.org/abs/2509.02602","url_pdf":"https://arxiv.org/pdf/2509.02602v1","authors":"[\"Moona Mazher\",\"Steven A Niederer\",\"Abdul Qayyum\"]","published":"2025-08-29T20:01:15Z","proceeding":"eess.IV","tasks":"[\"eess.IV\"]","methods":"[]","has_code":false,"code_links":[{"ID":610324,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2876899,"paper_url":"https://arxiv.org/abs/2509.02602","paper_title":"Masked Autoencoder Pretraining and BiXLSTM ResNet Architecture for PET/CT Tumor Segmentation","repo_url":"https://github.com/RespectKnowledge/AutoPet_2025_BxLSTM_UNET_Segmentation","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}