{"ID":2826873,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2602.03850","arxiv_id":"2602.03850","title":"WebAccessVL: Violation-Aware VLM for Web Accessibility","abstract":"We present a vision-language model (VLM) that automatically edits website HTML to address violations of the Web Content Accessibility Guidelines 2 (WCAG2) while preserving the original design. We formulate this as a supervised image-conditioned program synthesis task, where the model learns to correct HTML given both the code and its visual rendering. We create WebAccessVL, a website dataset with manually corrected accessibility violations. We then propose a violation-conditioned VLM that further takes the detected violations' descriptions from a checker as input. This conditioning enables an iterative checker-in-the-loop refinement strategy at test time. We conduct extensive evaluation on both open API and open-weight models. Empirically, our method achieves 0.211 violations per website, a 96.0% reduction from the 5.34 violations in raw data and 87% better than GPT-5. A perceptual study also confirms that our edited websites better maintain the original visual appearance and content.","short_abstract":"We present a vision-language model (VLM) that automatically edits website HTML to address violations of the Web Content Accessibility Guidelines 2 (WCAG2) while preserving the original design. We formulate this as a supervised image-conditioned program synthesis task, where the model learns to correct HTML given both t...","url_abs":"https://arxiv.org/abs/2602.03850","url_pdf":"https://arxiv.org/pdf/2602.03850v3","authors":"[\"Amber Yijia Zheng\",\"Jae Joong Lee\",\"Bedrich Benes\",\"Raymond A. Yeh\"]","published":"2025-12-19T01:53:22Z","proceeding":"cs.HC","tasks":"[\"cs.HC\",\"cs.AI\",\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false}
