Image-based Facial Rig Inversion

eess.IV arXiv:2510.13933
View PDF arXiv JSON

Abstract

We present an image-based rig inversion framework that leverages two modalities: RGB appearance and RGB-encoded normal maps. Each modality is processed by an independent Hiera transformer backbone, and the extracted features are fused to regress 102 rig parameters derived from the Facial Action Coding System (FACS). Experiments on synthetic and scanned datasets demonstrate that the method generalizes to scanned data, producing faithful reconstructions.

PDF Viewer