{"ID":2880720,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.13866","arxiv_id":"2508.13866","title":"SAGA: Learning Signal-Aligned Distributions for Improved Text-to-Image Generation","abstract":"State-of-the-art text-to-image models produce visually impressive results but often struggle with precise alignment to text prompts, leading to missing critical elements or unintended blending of distinct concepts. We propose a novel approach that learns a high-success-rate distribution conditioned on a target prompt, ensuring that generated images faithfully reflect the corresponding prompts. Our method explicitly models the signal component during the denoising process, offering fine-grained control that mitigates over-optimization and out-of-distribution artifacts. Moreover, our framework is training-free and seamlessly integrates with both existing diffusion and flow matching architectures. It also supports additional conditioning modalities -- such as bounding boxes -- for enhanced spatial alignment. Extensive experiments demonstrate that our approach outperforms current state-of-the-art methods. The code is available at https://github.com/grimalPaul/gsn-factory.","short_abstract":"State-of-the-art text-to-image models produce visually impressive results but often struggle with precise alignment to text prompts, leading to missing critical elements or unintended blending of distinct concepts. We propose a novel approach that learns a high-success-rate distribution conditioned on a target prompt,...","url_abs":"https://arxiv.org/abs/2508.13866","url_pdf":"https://arxiv.org/pdf/2508.13866v2","authors":"[\"Paul Grimal\",\"Michaël Soumm\",\"Hervé Le Borgne\",\"Olivier Ferret\",\"Akihiro Sugimoto\"]","published":"2025-08-19T14:31:15Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":610704,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2880720,"paper_url":"https://arxiv.org/abs/2508.13866","paper_title":"SAGA: Learning Signal-Aligned Distributions for Improved Text-to-Image Generation","repo_url":"https://github.com/grimalPaul/gsn-factory","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
