{"ID":2845285,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.04255","arxiv_id":"2511.04255","title":"MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection","abstract":"This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in medical imaging. While landmark detection has traditionally relied on domain-specific models, the emergence of large-scale pre-trained vision models presents new opportunities. In this study, we investigate the adaptation of Sapiens, a human-centric foundation model designed for pose estimation, to medical imaging through multi-dataset pretraining, establishing a new state of the art across multiple datasets. Our proposed model, MedSapiens, demonstrates that human-centric foundation models, inherently optimized for spatial pose localization, provide strong priors for anatomical landmark detection, yet this potential has remained largely untapped. We benchmark MedSapiens against existing state-of-the-art models, achieving up to 5.26% improvement over generalist models and up to 21.81% improvement over specialist models in the average success detection rate (SDR). To further assess MedSapiens adaptability to novel downstream tasks with few annotations, we evaluate its performance in limited-data settings, achieving 2.69% improvement over the few-shot state of the art in SDR. Code and model weights are available at https://github.com/xmed-lab/MedSapiens .","short_abstract":"This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in medical imaging. While landmark detection has traditionally relied on domain-specific models, the emergence of large-scale pre-tra...","url_abs":"https://arxiv.org/abs/2511.04255","url_pdf":"https://arxiv.org/pdf/2511.04255v1","authors":"[\"Marawan Elbatel\",\"Anbang Wang\",\"Keyuan Liu\",\"Kaouther Mouheb\",\"Enrique Almar-Munoz\",\"Lizhuo Lin\",\"Yanqi Yang\",\"Karim Lekadir\",\"Xiaomeng Li\"]","published":"2025-11-06T10:45:49Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.AI\",\"cs.LG\"]","methods":"[]","has_code":false,"code_links":[{"ID":607361,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2845285,"paper_url":"https://arxiv.org/abs/2511.04255","paper_title":"MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection","repo_url":"https://github.com/xmed-lab/MedSapiens","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}