{"ID":2865283,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.22244","arxiv_id":"2509.22244","title":"FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing","abstract":"Text-guided image editing with diffusion models has achieved remarkable quality but often suffers from prohibitive latency. We introduce \\textbf{FlashEdit}, a real-time localized image editing framework for the standard inversion-based editing setting. Its efficiency and precision stem from three key innovations: (1) a \\textbf{Cycle-Consistent One-Step Inversion (COSI)} pipeline that encourages manifold-aligned one-step inversion through cycle consistency; (2) a \\textbf{Background Shield (BG-Shield)} technique that improves preservation of non-edited regions via structural self-attention intervention; and (3) a \\textbf{Sparsified Spatial Cross-Attention (SSCA)} mechanism that promotes precise edits by suppressing semantic leakage. Experiments on PIE-Bench demonstrate a strong preservation-efficiency trade-off, with edits completed in under 0.2 seconds and an over 150$\\times$ speedup over DDIM-based multi-step editing. Our code will be made publicly available at \\url{https://github.com/JunyiWuCode/FlashEdit}.","short_abstract":"Text-guided image editing with diffusion models has achieved remarkable quality but often suffers from prohibitive latency. We introduce \\textbf{FlashEdit}, a real-time localized image editing framework for the standard inversion-based editing setting. Its efficiency and precision stem from three key innovations: (1) a...","url_abs":"https://arxiv.org/abs/2509.22244","url_pdf":"https://arxiv.org/pdf/2509.22244v6","authors":"[\"Junyi Wu\",\"Zhiteng Li\",\"Haotong Qin\",\"Yulun Zhang\",\"Xiaokang Yang\"]","published":"2025-09-26T11:59:30Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":609252,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2865283,"paper_url":"https://arxiv.org/abs/2509.22244","paper_title":"FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing","repo_url":"https://github.com/JunyiWuCode/FlashEdit","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}