{"ID":2885118,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.09178","arxiv_id":"2508.09178","title":"IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection","abstract":"Industrial anomaly detection is a critical component of modern manufacturing, yet the scarcity of defective samples restricts traditional detection methods to scenario-specific applications. Although Vision-Language Models (VLMs) demonstrate significant advantages in generalization capabilities, their performance in industrial anomaly detection remains limited. To address this challenge, we propose IAD-R1, a universal post-training framework applicable to VLMs of different architectures and parameter scales, which substantially enhances their anomaly detection capabilities. IAD-R1 employs a two-stage training strategy: the Perception Activation Supervised Fine-Tuning (PA-SFT) stage utilizes a meticulously constructed high-quality Chain-of-Thought dataset (Expert-AD) for training, enhancing anomaly perception capabilities and establishing reasoning-to-answer correlations; the Structured Control Group Relative Policy Optimization (SC-GRPO) stage employs carefully designed reward functions to achieve a capability leap from \"Anomaly Perception\" to \"Anomaly Interpretation\". Experimental results demonstrate that IAD-R1 achieves significant improvements across 7 VLMs; the largest improvement was on the DAGM dataset, with average accuracy 43.3% higher than the 0.5B baseline. Notably, the 0.5B parameter model trained with IAD-R1 surpasses commercial models including GPT-4.1 and Claude-Sonnet-4 in zero-shot settings, demonstrating the effectiveness and superiority of IAD-R1. The dataset, code, and all model weights will be publicly available at https://github.com/Yanhui-Lee/IAD-R1.","short_abstract":"Industrial anomaly detection is a critical component of modern manufacturing, yet the scarcity of defective samples restricts traditional detection methods to scenario-specific applications. Although Vision-Language Models (VLMs) demonstrate significant advantages in generalization capabilities, their performance in in...","url_abs":"https://arxiv.org/abs/2508.09178","url_pdf":"https://arxiv.org/pdf/2508.09178v2","authors":"[\"Yanhui Li\",\"Yunkang Cao\",\"Chengliang Liu\",\"Yuan Xiong\",\"Xinghui Dong\",\"Chao Huang\"]","published":"2025-08-07T09:34:45Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.AI\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":611151,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2885118,"paper_url":"https://arxiv.org/abs/2508.09178","paper_title":"IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection","repo_url":"https://github.com/Yanhui-Lee/IAD-R1","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
