{"ID":2864419,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24004","arxiv_id":"2509.24004","title":"SIE3D: Single-Image Expressive 3D Avatar Generation via Semantic Embedding and Perceptual Expression Loss","abstract":"Generating high-fidelity 3D head avatars from a single image is challenging, as current methods lack fine-grained, intuitive control over expressions via text. This paper proposes SIE3D, a framework that generates expressive 3D avatars from a single image and descriptive text. SIE3D fuses identity features from the image with semantic embedding from text through a novel conditioning scheme, enabling detailed control. To ensure generated expressions accurately match the text, it introduces an innovative perceptual expression loss function. This loss uses a pre-trained expression classifier to regularize the generation process, guaranteeing expression accuracy. Extensive experiments show SIE3D significantly improves controllability and realism, outperforming competitive methods in identity preservation and expression fidelity on a single consumer-grade GPU. Project page: https://huang-zhiqi.github.io/SIE3D/","short_abstract":"Generating high-fidelity 3D head avatars from a single image is challenging, as current methods lack fine-grained, intuitive control over expressions via text. This paper proposes SIE3D, a framework that generates expressive 3D avatars from a single image and descriptive text. SIE3D fuses identity features from the ima...","url_abs":"https://arxiv.org/abs/2509.24004","url_pdf":"https://arxiv.org/pdf/2509.24004v2","authors":"[\"Zhiqi Huang\",\"Dulongkai Cui\",\"Jinglu Hu\"]","published":"2025-09-28T17:56:42Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}