{"ID":2840527,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.13222","arxiv_id":"2511.13222","title":"Hybrid-Domain Adaptative Representation Learning for Gaze Estimation","abstract":"Appearance-based gaze estimation, aiming to predict accurate 3D gaze direction from a single facial image, has made promising progress in recent years. However, most methods suffer significant performance degradation in cross-domain evaluation due to interference from gaze-irrelevant factors, such as expressions, wearables, and image quality. To alleviate this problem, we present a novel Hybrid-domain Adaptative Representation Learning (shorted by HARL) framework that exploits multi-source hybrid datasets to learn robust gaze representation. More specifically, we propose to disentangle gaze-relevant representation from low-quality facial images by aligning features extracted from high-quality near-eye images in an unsupervised domain-adaptation manner, which hardly requires any computational or inference costs. Additionally, we analyze the effect of head-pose and design a simple yet efficient sparse graph fusion module to explore the geometric constraint between gaze direction and head-pose, leading to a dense and robust gaze representation. Extensive experiments on EyeDiap, MPIIFaceGaze, and Gaze360 datasets demonstrate that our approach achieves state-of-the-art accuracy of $\\textbf{5.02}^{\\circ}$ and $\\textbf{3.36}^{\\circ}$, and $\\textbf{9.26}^{\\circ}$ respectively, and present competitive performances through cross-dataset evaluation. The code is available at https://github.com/da60266/HARL.","short_abstract":"Appearance-based gaze estimation, aiming to predict accurate 3D gaze direction from a single facial image, has made promising progress in recent years. However, most methods suffer significant performance degradation in cross-domain evaluation due to interference from gaze-irrelevant factors, such as expressions, weara...","url_abs":"https://arxiv.org/abs/2511.13222","url_pdf":"https://arxiv.org/pdf/2511.13222v1","authors":"[\"Qida Tan\",\"Hongyu Yang\",\"Wenchao Du\"]","published":"2025-11-17T10:38:50Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":606977,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2840527,"paper_url":"https://arxiv.org/abs/2511.13222","paper_title":"Hybrid-Domain Adaptative Representation Learning for Gaze Estimation","repo_url":"https://github.com/da60266/HARL","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}