{"ID":2834854,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00814","arxiv_id":"2512.00814","title":"IRPO: Boosting Image Restoration via Post-training GRPO","abstract":"Post-training has become effective for high-level generation, but its role in low-level vision remains underexplored. Existing image restoration methods often rely on fixed pixel-wise fitting to ground-truth images, which can lead to over-smoothing and weak generalization. We propose IRPO, a GRPO-based post-training framework for deterministic restoration models. IRPO is built around two axes: data formulation and reward modeling. For data formulation, we select the 30% underperforming samples from the pre-training stage, which improves both accuracy and training efficiency. For reward modeling, we combine fidelity-oriented and quality-aware feedback with three components: a General Reward for structural fidelity, an Expert Reward that uses a Vision-Language Model as a coarse visual-quality judge, and a Restoration Reward for task-specific low-level cues. Experiments on six in-domain and five out-of-domain (OOD) benchmarks show that IRPO improves the AdaIR baseline by 0.93 dB on in-domain tasks and 3.43 dB on OOD settings. Our code is available at https://github.com/HaoxuanXU1024/IRPO.","short_abstract":"Post-training has become effective for high-level generation, but its role in low-level vision remains underexplored. Existing image restoration methods often rely on fixed pixel-wise fitting to ground-truth images, which can lead to over-smoothing and weak generalization. 
We propose IRPO, a GRPO-based post-training fr...","url_abs":"https://arxiv.org/abs/2512.00814","url_pdf":"https://arxiv.org/pdf/2512.00814v3","authors":"[\"Haoxuan Xu\",\"Yi Liu\",\"Tianfu Li\",\"Ruolin Shen\",\"Boyuan Jiang\",\"Jinlong Peng\",\"Donghao Luo\",\"Xiaobin Hu\",\"Shuicheng Yan\",\"Haoang Li\"]","published":"2025-11-30T09:42:24Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":606452,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2834854,"paper_url":"https://arxiv.org/abs/2512.00814","paper_title":"IRPO: Boosting Image Restoration via Post-training GRPO","repo_url":"https://github.com/HaoxuanXU1024/IRPO","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}