{"ID":2856344,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.05121","arxiv_id":"2512.05121","title":"PESTalk: Speech-Driven 3D Facial Animation with Personalized Emotional Styles","abstract":"PESTalk is a novel method for generating 3D facial animations with personalized emotional styles directly from speech. It overcomes key limitations of existing approaches by introducing a Dual-Stream Emotion Extractor (DSEE) that captures both time- and frequency-domain audio features for fine-grained emotion analysis, and an Emotional Style Modeling Module (ESMM) that models individual expression patterns based on voiceprint characteristics. To address data scarcity, the method leverages a newly constructed 3D-EmoStyle dataset. Evaluations demonstrate that PESTalk outperforms state-of-the-art methods in producing realistic and personalized facial animations.","short_abstract":"PESTalk is a novel method for generating 3D facial animations with personalized emotional styles directly from speech. It overcomes key limitations of existing approaches by introducing a Dual-Stream Emotion Extractor (DSEE) that captures both time- and frequency-domain audio features for fine-grained emotion analysis,...","url_abs":"https://arxiv.org/abs/2512.05121","url_pdf":"https://arxiv.org/pdf/2512.05121v1","authors":"[\"Tianshun Han\",\"Benjia Zhou\",\"Ajian Liu\",\"Yanyan Liang\",\"Du Zhang\",\"Zhen Lei\",\"Jun Wan\"]","published":"2025-10-13T13:21:38Z","proceeding":"cs.GR","tasks":"[\"cs.GR\",\"cs.AI\"]","methods":"[]","has_code":false}