{"ID":2838555,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.17138","arxiv_id":"2511.17138","title":"One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution","abstract":"Recent advances in diffusion-based real-world image super-resolution (Real-ISR) have demonstrated remarkable perceptual quality, yet the balance between fidelity and controllability remains a problem: multi-step diffusion-based methods suffer from generative diversity and randomness, resulting in low fidelity, while one-step methods lose control flexibility due to fidelity-specific finetuning. In this paper, we present ODTSR, a one-step diffusion transformer based on Qwen-Image that performs Real-ISR considering fidelity and controllability simultaneously: a newly introduced visual stream receives low-quality images (LQ) with adjustable noise (Control Noise), and the original visual stream receives LQs with consistent noise (Prior Noise), forming the Noise-hybrid Visual Stream (NVS) design. ODTSR further employs Fidelity-aware Adversarial Training (FAA) to enhance controllability and achieve one-step inference. Extensive experiments demonstrate that ODTSR not only achieves state-of-the-art (SOTA) performance on generic Real-ISR, but also enables prompt controllability on challenging scenarios such as real-world scene text image super-resolution (STISR) of Chinese characters without training on specific datasets. Codes are available at https://github.com/RedMediaTech/ODTSR.","short_abstract":"Recent advances in diffusion-based real-world image super-resolution (Real-ISR) have demonstrated remarkable perceptual quality, yet the balance between fidelity and controllability remains a problem: multi-step diffusion-based methods suffer from generative diversity and randomness, resulting in low fidelity, while on...","url_abs":"https://arxiv.org/abs/2511.17138","url_pdf":"https://arxiv.org/pdf/2511.17138v3","authors":"[\"Yushun Fang\",\"Yuxiang Chen\",\"Shibo Yin\",\"Qiang Hu\",\"Jiangchao Yao\",\"Ya Zhang\",\"Xiaoyun Zhang\",\"Yanfeng Wang\"]","published":"2025-11-21T11:00:59Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\",\"Transformer\"]","has_code":false,"code_links":[{"ID":606788,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2838555,"paper_url":"https://arxiv.org/abs/2511.17138","paper_title":"One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution","repo_url":"https://github.com/RedMediaTech/ODTSR","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
