{"ID":2846560,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.01331","arxiv_id":"2511.01331","title":"RobustVLA: Robustness-Aware Reinforcement Post-Training for Vision-Language-Action Models","abstract":"Vision-Language-Action (VLA) models have recently emerged as powerful general-purpose policies for robotic manipulation, benefiting from large-scale multi-modal pre-training. However, they often fail to generalize reliably in out-of-distribution deployments, where unavoidable disturbances such as observation noise, sensor errors, or actuation perturbations become prevalent. While recent Reinforcement Learning (RL)-based post-training provides a practical means to adapt pre-trained VLA models, existing methods mainly emphasize reward maximization and overlook robustness to environmental uncertainty. In this work, we introduce RobustVLA, a lightweight online RL post-training method designed to explicitly enhance the resilience of VLA models. Through a systematic robustness analysis, we identify two key regularizations: Jacobian regularization, which mitigates sensitivity to observation noise, and smoothness regularization, which stabilizes policies under action perturbations. Extensive experiments across diverse robotic environments demonstrate that RobustVLA significantly outperforms prior state-of-the-art methods in robustness and reliability. Our results highlight the importance of principled robustness-aware RL post-training as a key step toward improving the reliability and robustness of VLA models.","short_abstract":"Vision-Language-Action (VLA) models have recently emerged as powerful general-purpose policies for robotic manipulation, benefiting from large-scale multi-modal pre-training. However, they often fail to generalize reliably in out-of-distribution deployments, where unavoidable disturbances such as observation noise, sen...","url_abs":"https://arxiv.org/abs/2511.01331","url_pdf":"https://arxiv.org/pdf/2511.01331v2","authors":"[\"Hongyin Zhang\",\"Shuo Zhang\",\"Junxi Jin\",\"Qixin Zeng\",\"Runze Li\",\"Donglin Wang\"]","published":"2025-11-03T08:30:48Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}