{"ID":2829035,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.13492","arxiv_id":"2512.13492","title":"Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10$\\times$","abstract":"Native 4K (2160$\\times$3840) video generation remains a critical challenge due to the quadratic computational explosion of full-attention as spatiotemporal resolution increases, making it difficult for models to strike a balance between efficiency and quality. This paper proposes a novel Transformer retrofit strategy termed $\\textbf{T3}$ ($\\textbf{T}$ransform $\\textbf{T}$rained $\\textbf{T}$ransformer) that, without altering the core architecture of full-attention pretrained models, significantly reduces compute requirements by optimizing their forward logic. Specifically, $\\textbf{T3-Video}$ introduces a multi-scale weight-sharing window attention mechanism and, via hierarchical blocking together with an axis-preserving full-attention design, can effect an \"attention pattern\" transformation of a pretrained model using only modest compute and data. Results on 4K-VBench show that $\\textbf{T3-Video}$ substantially outperforms existing approaches: while delivering performance improvements (+4.29$\\uparrow$ VQA and +0.08$\\uparrow$ VTC), it accelerates native 4K video generation by more than 10$\\times$. Project page at https://zhangzjn.github.io/projects/T3-Video","short_abstract":"Native 4K (2160$\\times$3840) video generation remains a critical challenge due to the quadratic computational explosion of full-attention as spatiotemporal resolution increases, making it difficult for models to strike a balance between efficiency and quality. This paper proposes a novel Transformer retrofit strategy t...","url_abs":"https://arxiv.org/abs/2512.13492","url_pdf":"https://arxiv.org/pdf/2512.13492v1","authors":"[\"Jiangning Zhang\",\"Junwei Zhu\",\"Teng Hu\",\"Yabiao Wang\",\"Donghao Luo\",\"Weijian Cao\",\"Zhenye Gan\",\"Xiaobin Hu\",\"Zhucun Xue\",\"Chengjie Wang\"]","published":"2025-12-15T16:25:39Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Transformer\"]","has_code":false}
