{"ID":2832646,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.05905","arxiv_id":"2512.05905","title":"SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations","abstract":"Achieving controllable character animation that meets studio-grade standards remains challenging despite recent progress. Existing approaches can transfer motion from a driving video to a reference image, but often fail to preserve structural fidelity and temporal consistency in wild scenarios involving complex motion and cross-identity animations. In this work, we present \\textbf{SCAIL} (a framework toward \\textbf{S}tudio-grade \\textbf{C}haracter \\textbf{A}nimation via \\textbf{I}n-context \\textbf{L}earning), which is designed to address these challenges through two key innovations. First, we propose a novel 3D pose representation, providing a robust and flexible motion signal. Second, we introduce a full-context pose injection mechanism within a diffusion-transformer, enabling effective spatio-temporal reasoning over full motion sequences. To align with studio-grade requirements, we develop a curated data pipeline ensuring both diversity and quality, and establish a comprehensive benchmark for systematic evaluation. Experiments show that \\textbf{SCAIL} achieves state-of-the-art performance and advances character animation toward studio-grade control. Code and model are available at \\href{https://github.com/zai-org/SCAIL}{zai-org/SCAIL}.","short_abstract":"Achieving controllable character animation that meets studio-grade standards remains challenging despite recent progress. 
Existing approaches can transfer motion from a driving video to a reference image, but often fail to preserve structural fidelity and temporal consistency in wild scenarios involving complex motion...","url_abs":"https://arxiv.org/abs/2512.05905","url_pdf":"https://arxiv.org/pdf/2512.05905v3","authors":"[\"Wenhao Yan\",\"Sheng Ye\",\"Zhuoyi Yang\",\"Jiayan Teng\",\"ZhenHui Dong\",\"Kairui Wen\",\"Xiaotao Gu\",\"Yong-Jin Liu\",\"Jie Tang\"]","published":"2025-12-05T17:38:55Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\",\"Transformer\"]","has_code":false,"code_links":[{"ID":606265,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2832646,"paper_url":"https://arxiv.org/abs/2512.05905","paper_title":"SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations","repo_url":"https://github.com/zai-org/SCAIL","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
