{"ID":2885110,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05186","arxiv_id":"2508.05186","title":"Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation","abstract":"Recent vision-language-action (VLA) models for multi-task robot manipulation often rely on fixed camera setups and shared visual encoders, which limit their performance under occlusions and during cross-task transfer. To address these challenges, we propose Task-aware Virtual View Exploration (TVVE), a framework that learns to select task-relevant virtual camera viewpoints and dynamically re-render observations from a reconstructed scene representation using the selected viewpoints. To enable efficient view selection, we train an exploration policy in a pseudo-environment. In addition, we introduce a Task-aware Mixture-of-Experts (TaskMoE) visual encoder that routes visual features to task-specialized experts, mitigating interference in multi-task learning. To evaluate robustness under distribution shifts, we construct RLBench-OG, an out-of-distribution benchmark with visual perturbations and camera pose variations. Experiments on RLBench and RLBench-OG demonstrate that TVVE achieves higher success rates than strong baselines, while real-robot experiments further confirm its robustness to visual disturbances and unseen instructions. Code and visualizations are available at: https://hcplab-sysu.github.io/TAVP.","short_abstract":"Recent vision-language-action (VLA) models for multi-task robot manipulation often rely on fixed camera setups and shared visual encoders, which limit their performance under occlusions and during cross-task transfer. To address these challenges, we propose Task-aware Virtual View Exploration (TVVE), a framework that l...","url_abs":"https://arxiv.org/abs/2508.05186","url_pdf":"https://arxiv.org/pdf/2508.05186v5","authors":"[\"Yongjie Bai\",\"Zhouxia Wang\",\"Yang Liu\",\"Kaijun Luo\",\"Yifan Wen\",\"Mingtong Dai\",\"Weixing Chen\",\"Ziliang Chen\",\"Lingbo Liu\",\"Guanbin Li\",\"Liang Lin\"]","published":"2025-08-07T09:21:20Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.CV\"]","methods":"[\"LoRA\"]","has_code":false}
