{"ID":2869737,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.13774","arxiv_id":"2509.13774","title":"Dual-Actor Fine-Tuning of VLA Models: A Talk-and-Tweak Human-in-the-Loop Approach","abstract":"Vision-language-action (VLA) models demonstrate strong generalization in robotic manipulation but face challenges in complex, real-world tasks. While supervised fine-tuning with demonstrations is constrained by data quality, reinforcement learning (RL) offers a promising alternative. We propose a human-in-the-loop dual-actor fine-tuning framework grounded in RL. The framework integrates a primary actor for robust multi-task performance with a refinement actor for latent-space adaptation. Beyond standard physical interventions, we introduce a lightweight talk-and-tweak scheme that converts human corrections into semantically grounded language commands, thereby generating a new dataset for policy learning. In real-world multi-task experiments, our approach achieves 100% success across three tasks within 101 minutes of online fine-tuning. For long-horizon tasks, it sustains a 50% success rate over 12 consecutive operations. Furthermore, the framework scales effectively to multi-robot training, achieving up to a 2 times improvement in efficiency when using dual robots. The experiment videos are available at https://sites.google.com/view/hil-daft/.","short_abstract":"Vision-language-action (VLA) models demonstrate strong generalization in robotic manipulation but face challenges in complex, real-world tasks. While supervised fine-tuning with demonstrations is constrained by data quality, reinforcement learning (RL) offers a promising alternative. We propose a human-in-the-loop dual...","url_abs":"https://arxiv.org/abs/2509.13774","url_pdf":"https://arxiv.org/pdf/2509.13774v1","authors":"[\"Piaopiao Jin\",\"Qi Wang\",\"Guokang Sun\",\"Ziwen Cai\",\"Pinjia He\",\"Yangwei You\"]","published":"2025-09-17T07:44:59Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Reinforcement Learning\"]","project_urls":"[\"https://sites.google.com/view/hil-daft/\"]","has_code":false,"code_links":[{"ID":609715,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2869737,"paper_url":"https://arxiv.org/abs/2509.13774","paper_title":"Dual-Actor Fine-Tuning of VLA Models: A Talk-and-Tweak Human-in-the-Loop Approach","repo_url":"https://github.com/google/safevalues","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
