{"ID":2893255,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.14367","arxiv_id":"2507.14367","title":"Hallucination Score: Towards Mitigating Hallucinations in Generative Image Super-Resolution","abstract":"Generative super-resolution (GSR) currently sets the state-of-the-art in terms of perceptual image quality, overcoming the \"regression-to-the-mean\" blur of prior non-generative models. However, from a human perspective, such models do not fully conform to the optimal balance between quality and fidelity. Instead, a different class of artifacts, in which generated details fail to perceptually match the low resolution image (LRI) or ground-truth image (GTI), is a critical but under-studied issue in GSR, limiting its practical deployment. In this work, we focus on measuring, analyzing, and mitigating these artifacts (i.e., \"hallucinations\"). We observe that hallucinations are not well-characterized with existing image metrics or quality models, as they are orthogonal to both exact fidelity and no-reference quality. Instead, we take advantage of multimodal large language models (MLLMs) by constructing a prompt that assesses hallucinatory visual elements and generates a \"Hallucination Score\" (HS). We find that HS is closely aligned with human evaluations, and also provides complementary insights to prior image metrics used for super-resolution (SR) models. Finally, we propose a few efficient HS proxies and demonstrate how diffusion-based GSR models can be fine-tuned to mitigate hallucinations, leveraging HS proxies as differentiable reward functions.","short_abstract":"Generative super-resolution (GSR) currently sets the state-of-the-art in terms of perceptual image quality, overcoming the \"regression-to-the-mean\" blur of prior non-generative models. However, from a human perspective, such models do not fully conform to the optimal balance between quality and fidelity. Instead, a dif...","url_abs":"https://arxiv.org/abs/2507.14367","url_pdf":"https://arxiv.org/pdf/2507.14367v2","authors":"[\"Weiming Ren\",\"Raghav Goyal\",\"Zhiming Hu\",\"Tristan Ty Aumentado-Armstrong\",\"Iqbal Mohomed\",\"Alex Levinshtein\"]","published":"2025-07-18T21:13:50Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\",\"Large Language Model\",\"Language Model\"]","has_code":false}
