{"ID":2835896,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.22294","arxiv_id":"2511.22294","title":"Structure is Supervision: Multiview Masked Autoencoders for Radiology","abstract":"Building robust medical machine learning systems requires pretraining strategies that exploit the intrinsic structure present in clinical data. We introduce Multiview Masked Autoencoder (MVMAE), a self-supervised framework that leverages the natural multi-view organization of radiology studies to learn view-invariant and disease-relevant representations. MVMAE combines masked image reconstruction with cross-view alignment, transforming clinical redundancy across projections into a powerful self-supervisory signal. We further extend this approach with MVMAE-V2T, which incorporates radiology reports as an auxiliary text-based learning signal to enhance semantic grounding while preserving fully vision-based inference. Evaluated on a downstream disease classification task on three large-scale public datasets, MIMIC-CXR, CheXpert, and PadChest, MVMAE consistently outperforms supervised and vision-language baselines. Furthermore, MVMAE-V2T provides additional gains, particularly in low-label regimes where structured textual supervision is most beneficial. Together, these results establish the importance of structural and textual supervision as complementary paths toward scalable, clinically grounded medical foundation models.","short_abstract":"Building robust medical machine learning systems requires pretraining strategies that exploit the intrinsic structure present in clinical data. We introduce Multiview Masked Autoencoder (MVMAE), a self-supervised framework that leverages the natural multi-view organization of radiology studies to learn view-invariant a...","url_abs":"https://arxiv.org/abs/2511.22294","url_pdf":"https://arxiv.org/pdf/2511.22294v4","authors":"[\"Sonia Laguna\",\"Andrea Agostini\",\"Alain Ryser\",\"Samuel Ruiperez-Campillo\",\"Irene Cannistraci\",\"Moritz Vandenhirtz\",\"Stephan Mandt\",\"Nicolas Deperrois\",\"Farhad Nooralahzadeh\",\"Michael Krauthammer\",\"Thomas M. Sutter\",\"Julia E. Vogt\"]","published":"2025-11-27T10:20:51Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.LG\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
