{"ID":2892769,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.14503","arxiv_id":"2507.14503","title":"Generative Distribution Distillation","abstract":"In this paper, we formulate the knowledge distillation (KD) as a conditional generative problem and propose the \\textit{Generative Distribution Distillation (GenDD)} framework. A naive \\textit{GenDD} baseline encounters two major challenges: the curse of high-dimensional optimization and the lack of semantic supervision from labels. To address these issues, we introduce a \\textit{Split Tokenization} strategy, achieving stable and effective unsupervised KD. Additionally, we develop the \\textit{Distribution Contraction} technique to integrate label supervision into the reconstruction objective. Our theoretical proof demonstrates that \\textit{GenDD} with \\textit{Distribution Contraction} serves as a gradient-level surrogate for multi-task learning, realizing efficient supervised training without explicit classification loss on multi-step sampling image representations. To evaluate the effectiveness of our method, we conduct experiments on balanced, imbalanced, and unlabeled data. Experimental results show that \\textit{GenDD} performs competitively in the unsupervised setting, significantly surpassing KL baseline by \\textbf{16.29\\%} on ImageNet validation set. With label supervision, our ResNet-50 achieves \\textbf{82.28\\%} top-1 accuracy on ImageNet in 600 epochs training, establishing a new state-of-the-art.","short_abstract":"In this paper, we formulate the knowledge distillation (KD) as a conditional generative problem and propose the \\textit{Generative Distribution Distillation (GenDD)} framework. A naive \\textit{GenDD} baseline encounters two major challenges: the curse of high-dimensional optimization and the lack of semantic supervisio...","url_abs":"https://arxiv.org/abs/2507.14503","url_pdf":"https://arxiv.org/pdf/2507.14503v1","authors":"[\"Jiequan Cui\",\"Beier Zhu\",\"Qingshan Xu\",\"Xiaogang Xu\",\"Pengguang Chen\",\"Xiaojuan Qi\",\"Bei Yu\",\"Hanwang Zhang\",\"Richang Hong\"]","published":"2025-07-19T06:27:42Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.CV\"]","methods":"[]","has_code":false}
