{"ID":2887445,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.01941","arxiv_id":"2508.01941","title":"Less is More: AMBER-AFNO -- a New Benchmark for Lightweight 3D Medical Image Segmentation","abstract":"We adapt the remote sensing-inspired AMBER model from multi-band image segmentation to 3D medical datacube segmentation. To address the computational bottleneck of the volumetric transformer, we propose the AMBER-AFNO architecture. This approach uses Adaptive Fourier Neural Operators (AFNO) instead of the multi-head self-attention mechanism. Unlike spatial pairwise interactions between tokens, global token mixing in the frequency domain avoids $\\mathcal{O}(N^2)$ attention-weight calculations. As a result, AMBER-AFNO achieves quasi-linear computational complexity and linear memory scaling. This new way to model global context reduces reliance on dense transformers while preserving global contextual modeling capability. By using attention-free spectral operations, our design offers a compact parameterization and maintains a competitive computational complexity. We evaluate AMBER-AFNO on three public datasets: ACDC, Synapse, and BraTS. On these datasets, the model achieves state-of-the-art or near-state-of-the-art results for DSC and HD95. Compared with recent compact CNN and Transformer architectures, our approach yields higher Dice scores while maintaining a compact model size. Overall, our results show that frequency-domain token mixing with AFNO provides a fast and efficient alternative to self-attention mechanisms for 3D medical image segmentation.","short_abstract":"We adapt the remote sensing-inspired AMBER model from multi-band image segmentation to 3D medical datacube segmentation. To address the computational bottleneck of the volumetric transformer, we propose the AMBER-AFNO architecture. This approach uses Adaptive Fourier Neural Operators (AFNO) instead of the multi-head se...","url_abs":"https://arxiv.org/abs/2508.01941","url_pdf":"https://arxiv.org/pdf/2508.01941v2","authors":"[\"Andrea Dosi\",\"Semanto Mondal\",\"Rajib Chandra Ghosh\",\"Massimo Brescia\",\"Giuseppe Longo\"]","published":"2025-08-03T22:31:00Z","proceeding":"eess.IV","tasks":"[\"eess.IV\",\"cs.AI\",\"cs.CV\",\"cs.LG\"]","methods":"[\"Transformer\",\"Convolutional Neural Network\"]","has_code":false}
