{"ID":2860453,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.03597","arxiv_id":"2510.03597","title":"Neon: Negative Extrapolation From Self-Training Improves Image Generation","abstract":"Scaling generative AI models is bottlenecked by the scarcity of high-quality training data. The ease of synthesizing from a generative model suggests using (unverified) synthetic data to augment a limited corpus of real data for the purpose of fine-tuning in the hope of improving performance. Unfortunately, however, the resulting positive feedback loop leads to model autophagy disorder (MAD, aka model collapse) that results in a rapid degradation in sample quality and/or diversity. In this paper, we introduce Neon (for Negative Extrapolation frOm self-traiNing), a new learning method that turns the degradation from self-training into a powerful signal for self-improvement. Given a base model, Neon first fine-tunes it on its own self-synthesized data but then, counterintuitively, reverses its gradient updates to extrapolate away from the degraded weights. We prove that Neon works because typical inference samplers that favor high-probability regions create a predictable anti-alignment between the synthetic and real data population gradients, which negative extrapolation corrects to better align the model with the true data distribution. Neon is remarkably easy to implement via a simple post-hoc merge that requires no new real data, works effectively with as few as 1k synthetic samples, and typically uses less than 1% additional training compute. We demonstrate Neon's universality across a range of architectures (diffusion, flow matching, autoregressive, and inductive moment matching models) and datasets (ImageNet, CIFAR-10, and FFHQ). In particular, on ImageNet 256x256, Neon elevates the xAR-L model to a new state-of-the-art FID of 1.02 with only 0.36% additional training compute. Code is available at https://github.com/VITA-Group/Neon","short_abstract":"Scaling generative AI models is bottlenecked by the scarcity of high-quality training data. The ease of synthesizing from a generative model suggests using (unverified) synthetic data to augment a limited corpus of real data for the purpose of fine-tuning in the hope of improving performance. Unfortunately, however, th...","url_abs":"https://arxiv.org/abs/2510.03597","url_pdf":"https://arxiv.org/pdf/2510.03597v3","authors":"[\"Sina Alemohammad\",\"Zhangyang Wang\",\"Richard G. Baraniuk\"]","published":"2025-10-04T01:20:30Z","proceeding":"cs.GR","tasks":"[\"cs.GR\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":608731,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2860453,"paper_url":"https://arxiv.org/abs/2510.03597","paper_title":"Neon: Negative Extrapolation From Self-Training Improves Image Generation","repo_url":"https://github.com/VITA-Group/Neon","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}