{"ID":2923651,"CreatedAt":"2026-06-02T04:05:25.881865328Z","UpdatedAt":"2026-06-04T13:12:39.622923895Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.02277","arxiv_id":"2606.02277","title":"RoboSemanticBench: Diagnosing Semantic Grounding in Action Prediction for VLA Models","abstract":"Vision-language-action (VLA) models are built on the premise that semantic understanding from pretrained language or vision-language backbones should guide robot action prediction. Yet robot fine-tuning is optimized as imitation over task-specific action distributions, and many evaluations can be solved through visual or instruction-action shortcuts. We introduce RoboSemanticBench (RSB), an embodied benchmark for diagnosing semantic grounding in action prediction: whether post-trained VLA models can use complex instruction semantics to select and manipulate the correct physical target. In each episode, a robot receives a multiple-choice math or general-knowledge question, observes candidate answer blocks, and must grasp the block corresponding to the correct answer. RSB covers controlled arithmetic, grade-school mathematical understanding, and commonsense or factual understanding under four-choice and ten-choice suites. Across representative VLA models, we find that many policies learn to grasp candidate blocks but select the semantically correct block at near-random or below-random rates after controlling for grasp success, revealing a persistent gap between backbone-level semantic competence and action prediction.","short_abstract":"Vision-language-action (VLA) models are built on the premise that semantic understanding from pretrained language or vision-language backbones should guide robot action prediction. Yet robot fine-tuning is optimized as imitation over task-specific action distributions, and many evaluations can be solved through visual...","url_abs":"https://arxiv.org/abs/2606.02277","url_pdf":"https://arxiv.org/pdf/2606.02277v1","authors":"[\"Bin Yu\",\"Yao Zhang\",\"Haishan Liu\",\"Shijie Lian\",\"Yuliang Wei\",\"Xiaopeng Lin\",\"Zhaolong Shen\",\"Changti Wu\",\"Ruina Hu\",\"Bailing Wang\",\"Cong Huang\",\"Kai Chen\"]","published":"2026-06-01T14:02:37Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[]","has_code":false}
