{"ID":2837187,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.20841","arxiv_id":"2511.20841","title":"OVAL-Grasp: Open-Vocabulary Affordance Localization for Task Oriented Grasping","abstract":"To manipulate objects in novel, unstructured environments, robots need task-oriented grasps that target object parts based on the given task. Geometry-based methods often struggle with visually defined parts, occlusions, and unseen objects. We introduce OVAL-Grasp, a zero-shot open-vocabulary approach to task-oriented, affordance based grasping that uses large-language models and vision-language models to allow a robot to grasp objects at the correct part according to a given task. Given an RGB image and a task, OVAL-Grasp identifies parts to grasp or avoid with an LLM, segments them with a VLM, and generates a 2D heatmap of actionable regions on the object. During our evaluations, we found that our method outperformed two task oriented grasping baselines on experiments with 20 household objects with 3 unique tasks for each. OVAL-Grasp successfully identifies and segments the correct object part 95% of the time and grasps the correct actionable area 78.3% of the time in real-world experiments with the Fetch mobile manipulator. Additionally, OVAL-Grasp finds correct object parts under partial occlusions, demonstrating a part selection success rate of 80% in cluttered scenes. We also demonstrate OVAL-Grasp's efficacy in scenarios that rely on visual features for part selection, and show the benefit of a modular design through our ablation experiments. Our project webpage is available at https://ekjt.github.io/OVAL-Grasp/","short_abstract":"To manipulate objects in novel, unstructured environments, robots need task-oriented grasps that target object parts based on the given task. Geometry-based methods often struggle with visually defined parts, occlusions, and unseen objects. We introduce OVAL-Grasp, a zero-shot open-vocabulary approach to task-oriented,...","url_abs":"https://arxiv.org/abs/2511.20841","url_pdf":"https://arxiv.org/pdf/2511.20841v1","authors":"[\"Edmond Tong\",\"Advaith Balaji\",\"Anthony Opipari\",\"Stanley Lewis\",\"Zhen Zeng\",\"Odest Chadwicke Jenkins\"]","published":"2025-11-25T20:46:31Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
