{"ID":2923182,"CreatedAt":"2026-06-02T03:17:13.356150003Z","UpdatedAt":"2026-06-04T07:41:34.29888543Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.02133","arxiv_id":"2606.02133","title":"Variational Learning for Insertion-based Generation","abstract":"Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive modeling by allowing tokens to be generated in non-fixed and prescribed orders. Despite their practical advantages, most existing non-monotonic models are order-agnostic and rely on a fixed-length grid, limiting their ability to support variable-length generation and adaptive insertion order. In this work, we introduce a probabilistic framework for learning insertion order in variable-length insertion models. We formalize a bijective correspondence between insertion trajectories and permutations, which enables an exact reparameterization of the data likelihood as a sum over permutations. Building on this result, we propose the Insertion Process (IP), a stochastic generative model that jointly learns where to insert, what to insert, and when to terminate, trained via permutation-based variational inference. Unlike prior fixed-canvas approaches, IP natively supports variable-length generation and learns data-driven preferences over insertion orders. Experiments on goal-conditioned planning and molecular string generation demonstrate that learning insertion order improves both modeling quality and generalization in domains without a canonical left-to-right structure.","short_abstract":"Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive modeling by allowing tokens to be generated in non-fixed and prescribed orders. Despite their practical advantages, most existing non-monotonic models are order-agnostic and rely on...","url_abs":"https://arxiv.org/abs/2606.02133","url_pdf":"https://arxiv.org/pdf/2606.02133v1","authors":"[\"Yangtian Zhang\",\"Zhe Wang\",\"Arthur Gretton\",\"Rex Ying\",\"David van Dijk\",\"Michalis K. Titsias\",\"Jiaxin Shi\"]","published":"2026-06-01T11:59:46Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Diffusion Model\"]","has_code":false}
