{"ID":2847505,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.27266","arxiv_id":"2510.27266","title":"Enhancing Trustworthy GUI Grounding via Self-Critiqued Reinforcement Learning","abstract":"Autonomous graphical user interface (GUI) agents rely on accurate GUI grounding, which maps language instructions to on-screen coordinates, to execute user commands. However, current models, whether trained via supervised fine-tuning (SFT) or reinforcement learning (RL), often provide confidence signals that are poorly aligned with actual grounding correctness, leading to overconfident and unreliable predictions. To address this, we propose HyperClick, a novel framework that enhances trustworthy GUI grounding through self-critiqued reinforcement learning (SCRL). HyperClick combines a correctness reward and a confidence alignment reward, training the policy model to output both a click prediction and an explicit confidence estimate. This approach jointly optimizes grounding accuracy and confidence reliability through confidence-based self-assessment. Extensive experiments on challenging benchmarks show that HyperClick maintains strong grounding performance while providing better-aligned confidence estimates. By exposing uncertainty alongside GUI actions, HyperClick supports confidence-based abstention in GUI automation. Code will be released here.","short_abstract":"Autonomous graphical user interface (GUI) agents rely on accurate GUI grounding, which maps language instructions to on-screen coordinates, to execute user commands. However, current models, whether trained via supervised fine-tuning (SFT) or reinforcement learning (RL), often provide confidence signals that are poorly...","url_abs":"https://arxiv.org/abs/2510.27266","url_pdf":"https://arxiv.org/pdf/2510.27266v2","authors":"[\"Shaojie Zhang\",\"Pei Fu\",\"Ruoceng Zhang\",\"Jiahui Yang\",\"Anan Du\",\"Xiuwen Xi\",\"Shaokang Wang\",\"Ying Huang\",\"Bin Qin\",\"Zhenbo Luo\",\"Jian Luan\"]","published":"2025-10-31T08:07:02Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
