{"ID":2885372,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05614","arxiv_id":"2508.05614","title":"GroundAct: Can LLM Agents Ground Actions in Environmental States?","abstract":"LLM agents achieve 85-96% success on tasks where instructions fully specify the action, but drop to 29-53% when action feasibility depends on environmental state that the instruction does not mention. We argue that this gap reflects a missing capability: action grounding, the ability to infer from structured environmental state whether an action is feasible, what prerequisites it lacks, and whether it exceeds individual capacity. We introduce GroundAct, a benchmark of 1,500 scenarios and 16,592 task instances in text-based interactive environments spanning 11 domains, with tasks organized into seven categories along a cognitive complexity hierarchy. Evaluating 15 LLMs (3B-671B), we find three diagnostic patterns: (i) attribute reasoning is weakly correlated with tool and coordination reasoning, producing distinct model profiles; (ii) complete environment graphs yield up to +27.6/-22.9% on tool use vs. implicit collaboration, separating search-bound from constraint-filtering bottlenecks; and (iii) supervised fine-tuning lifts Qwen2.5-3B from 0.6% to 76.3% on direct command but only 1.5% to 5.5% on implicit collaboration. These results establish action grounding as a multi-dimensional challenge irreducible to scaling.","short_abstract":"LLM agents achieve 85-96% success on tasks where instructions fully specify the action, but drop to 29-53% when action feasibility depends on environmental state that the instruction does not mention. We argue that this gap reflects a missing capability: action grounding, the ability to infer from structured environmen...","url_abs":"https://arxiv.org/abs/2508.05614","url_pdf":"https://arxiv.org/pdf/2508.05614v2","authors":"[\"Zixuan Wang\",\"Dingming Li\",\"Hongxing Li\",\"Yanrui Miao\",\"Shuo Chen\",\"Yuchen Yan\",\"Wenqi Zhang\",\"Yongliang Shen\",\"Weiming Lu\",\"Jun Xiao\",\"Yueting Zhuang\"]","published":"2025-08-07T17:54:15Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Generative Adversarial Network\"]","has_code":false}
