{"ID":2862105,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.00910","arxiv_id":"2510.00910","title":"PAL-Net: A Point-Wise CNN with Patch-Attention for 3D Facial Landmark Localization","abstract":"Manual annotation of anatomical landmarks on 3D facial scans is a time-consuming and expertise-dependent task, yet it remains critical for clinical assessments, morphometric analysis, and craniofacial research. While several deep learning methods have been proposed for facial landmark localization, most focus on pseudo-landmarks or require complex input representations, limiting their clinical applicability. This study presents a fully automated deep learning pipeline (PAL-Net) for localizing 50 anatomical landmarks on stereo-photogrammetry facial models. The method combines coarse alignment, region-of-interest filtering, and an initial approximation of landmarks with a patch-based pointwise CNN enhanced by attention mechanisms. Trained and evaluated on 214 annotated scans from healthy adults, PAL-Net achieved a mean localization error of 3.686 mm and preserves relevant anatomical distances with a 2.822 mm average error, comparable to intra-observer variability. To assess generalization, the model was further evaluated on 700 subjects from the FaceScape dataset, achieving a point-wise error of 0.41 mm and a distance-wise error of 0.38 mm. Compared to existing methods, PAL-Net offers a favorable trade-off between accuracy and computational cost. While performance degrades in regions with poor mesh quality (e.g., ears, hairline), the method demonstrates consistent accuracy across most anatomical regions. PAL-Net generalizes effectively across datasets and facial regions, outperforming existing methods in both point-wise and structural evaluations. 
It provides a lightweight, scalable solution for high-throughput 3D anthropometric analysis, with potential to support clinical workflows and reduce reliance on manual annotation. Source code can be found at https://github.com/Ali5hadman/PAL-Net-A-Point-Wise-CNN-with-Patch-Attention","short_abstract":"Manual annotation of anatomical landmarks on 3D facial scans is a time-consuming and expertise-dependent task, yet it remains critical for clinical assessments, morphometric analysis, and craniofacial research. While several deep learning methods have been proposed for facial landmark localization, most focus on pseudo...","url_abs":"https://arxiv.org/abs/2510.00910","url_pdf":"https://arxiv.org/pdf/2510.00910v2","authors":"[\"Ali Shadman Yazdi\",\"Annalisa Cappella\",\"Benedetta Baldini\",\"Riccardo Solazzo\",\"Gianluca Tartaglia\",\"Chiarella Sforza\",\"Giuseppe Baselli\"]","published":"2025-10-01T13:52:35Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false,"code_links":[{"ID":608871,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2862105,"paper_url":"https://arxiv.org/abs/2510.00910","paper_title":"PAL-Net: A Point-Wise CNN with Patch-Attention for 3D Facial Landmark Localization","repo_url":"https://github.com/Ali5hadman/PAL-Net-A-Point-Wise-CNN-with-Patch-Attention","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}