{"ID":2860073,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05307","arxiv_id":"2510.05307","title":"When Should Users Check? Modeling Confirmation Frequency in Multi-Step Agentic AI Tasks","abstract":"Existing AI agents typically execute multi-step tasks autonomously and only allow user confirmation at the end. During execution, users have little control, making the confirm-at-end approach brittle: a single error can cascade and force a complete restart. Confirming every step avoids such failures, but imposes tedious overhead. Balancing excessive interruptions against costly rollbacks remains an open challenge. We address this problem by modeling confirmation as a minimum time scheduling problem. We conducted a formative study with eight participants, which revealed a recurring Confirmation-Diagnosis-Correction-Redo (CDCR) pattern in how users monitor errors. Based on this pattern, we developed a decision-theoretic model to determine time-efficient confirmation point placement. We then evaluated our approach using a within-subjects study where 48 participants monitored AI agents and repaired their mistakes while executing tasks. Results show that 81 percent of participants preferred our intermediate confirmation approach over the confirm-at-end approach used by existing systems, and task completion time was reduced by 13.54 percent.","short_abstract":"Existing AI agents typically execute multi-step tasks autonomously and only allow user confirmation at the end. During execution, users have little control, making the confirm-at-end approach brittle: a single error can cascade and force a complete restart. Confirming every step avoids such failures, but imposes tediou...","url_abs":"https://arxiv.org/abs/2510.05307","url_pdf":"https://arxiv.org/pdf/2510.05307v3","authors":"[\"Jieyu Zhou\",\"Aryan Roy\",\"Sneh Gupta\",\"Daniel Weitekamp\",\"Christopher J. MacLellan\"]","published":"2025-10-06T19:18:56Z","proceeding":"cs.HC","tasks":"[\"cs.HC\"]","methods":"[]","has_code":false}
