{"ID":2878639,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.18159","arxiv_id":"2508.18159","title":"SpotEdit: Evaluating Visually-Guided Image Editing Methods","abstract":"Visually-guided image editing, where edits are conditioned on both visual cues and textual prompts, has emerged as a powerful paradigm for fine-grained, controllable content generation. Although recent generative models have shown remarkable capabilities, existing evaluations remain simple and insufficiently representative of real-world editing challenges. We present SpotEdit, a comprehensive benchmark designed to systematically assess visually-guided image editing methods across diverse diffusion, autoregressive, and hybrid generative models, uncovering substantial performance disparities. To address a critical yet underexplored challenge, our benchmark includes a dedicated component on hallucination, highlighting how leading models, such as GPT-4o, often hallucinate the existence of a visual cue and erroneously perform the editing task. Our code and benchmark are publicly released at https://github.com/SaraGhazanfari/SpotEdit.","short_abstract":"Visually-guided image editing, where edits are conditioned on both visual cues and textual prompts, has emerged as a powerful paradigm for fine-grained, controllable content generation. Although recent generative models have shown remarkable capabilities, existing evaluations remain simple and insufficiently representa...","url_abs":"https://arxiv.org/abs/2508.18159","url_pdf":"https://arxiv.org/pdf/2508.18159v2","authors":"[\"Sara Ghazanfari\",\"Wei-An Lin\",\"Haitong Tian\",\"Ersin Yumer\"]","published":"2025-08-25T16:08:57Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.LG\"]","methods":"[\"Diffusion Model\"]","has_code":false,"code_links":[{"ID":610501,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2878639,"paper_url":"https://arxiv.org/abs/2508.18159","paper_title":"SpotEdit: Evaluating Visually-Guided Image Editing Methods","repo_url":"https://github.com/SaraGhazanfari/SpotEdit","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}