{"ID":2880102,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.14413","arxiv_id":"2508.14413","title":"Disentanglement in T-space for Faster and Distributed Training of Diffusion Models with Fewer Latent-states","abstract":"We challenge a fundamental assumption of diffusion models, namely, that a large number of latent-states or time-steps is required for training so that the reverse generative process is close to a Gaussian. We first show that with careful selection of a noise schedule, diffusion models trained over a small number of latent states (i.e. $T \\sim 32$) match the performance of models trained over a much larger number of latent states ($T \\sim 1,000$). Second, we push this limit (on the minimum number of latent states required) to a single latent-state, which we refer to as complete disentanglement in T-space. We show that high quality samples can be easily generated by the disentangled model obtained by combining several independently trained single latent-state models. We provide extensive experiments to show that the proposed disentangled model provides 4-6$\\times$ faster convergence measured across a variety of metrics on two different datasets.","short_abstract":"We challenge a fundamental assumption of diffusion models, namely, that a large number of latent-states or time-steps is required for training so that the reverse generative process is close to a Gaussian. We first show that with careful selection of a noise schedule, diffusion models trained over a small number of lat...","url_abs":"https://arxiv.org/abs/2508.14413","url_pdf":"https://arxiv.org/pdf/2508.14413v1","authors":"[\"Samarth Gupta\",\"Raghudeep Gadde\",\"Rui Chen\",\"Aleix M. Martinez\"]","published":"2025-08-20T04:21:26Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false}