{"ID":2834883,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00872","arxiv_id":"2512.00872","title":"TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models","abstract":"Existing foundation models (FMs) in the medical domain often require extensive fine-tuning or rely on training resource-intensive decoders, while many existing encoders are pretrained with objectives biased toward specific tasks. This illustrates a need for a strong, task-agnostic foundation model that requires minimal fine-tuning beyond feature extraction. In this work, we introduce a suite of task-agnostic pretraining of CT foundation models (TAP-CT): a simple yet effective adaptation of Vision Transformers (ViTs) and DINOv2 for volumetric data, enabling scalable self-supervised pretraining directly on 3D CT volumes. Our approach incorporates targeted modifications to patch embeddings, positional encodings, and volumetric augmentations, making the architecture depth-aware while preserving the simplicity of the underlying architectures. We show that large-scale 3D pretraining on an extensive in-house CT dataset (105K volumes) yields stable, robust frozen representations that generalize strongly across downstream tasks. To promote transparency and reproducibility, and to establish a powerful, low-resource baseline for future research in medical imaging, we will release all pretrained models, experimental configurations, and downstream benchmark code at https://huggingface.co/fomofo/tap-ct-b-3d.","short_abstract":"Existing foundation models (FMs) in the medical domain often require extensive fine-tuning or rely on training resource-intensive decoders, while many existing encoders are pretrained with objectives biased toward specific tasks. This illustrates a need for a strong, task-agnostic foundation model that requires minimal...","url_abs":"https://arxiv.org/abs/2512.00872","url_pdf":"https://arxiv.org/pdf/2512.00872v1","authors":"[\"Tim Veenboer\",\"George Yiasemis\",\"Eric Marcus\",\"Vivien Van Veldhuizen\",\"Cees G. M. Snoek\",\"Jonas Teuwen\",\"Kevin B. W. Groot Lipman\"]","published":"2025-11-30T12:43:15Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.AI\"]","methods":"[\"Vision Transformer\",\"Transformer\"]","has_code":false}
