{"ID":2882284,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.10538","arxiv_id":"2508.10538","title":"MLM: Learning Multi-task Loco-Manipulation Whole-Body Control for Quadruped Robot with Arm","abstract":"Whole-body loco-manipulation for quadruped robots with arms remains a challenging problem, particularly in achieving multi-task control. To address this, we propose MLM, a reinforcement learning framework driven by both real-world and simulation data. It enables a six-DoF robotic arm-equipped quadruped robot to perform whole-body loco-manipulation for multiple tasks autonomously or under human teleoperation. To address the problem of balancing multiple tasks during the learning of loco-manipulation, we introduce a trajectory library with an adaptive, curriculum-based sampling mechanism. This approach allows the policy to efficiently leverage real-world collected trajectories for learning multi-task loco-manipulation. To address deployment scenarios with only historical observations and to enhance the performance of policy execution across tasks with different spatial ranges, we propose a Trajectory-Velocity Prediction policy network. It predicts unobservable future trajectories and velocities. By leveraging extensive simulation data and curriculum-based rewards, our controller achieves whole-body behaviors in simulation and zero-shot transfer to real-world deployment. Ablation studies in simulation verify the necessity and effectiveness of our approach, while real-world experiments on a Go2 robot with an Airbot robotic arm demonstrate the policy's good performance in multi-task execution.","short_abstract":"Whole-body loco-manipulation for quadruped robots with arms remains a challenging problem, particularly in achieving multi-task control. To address this, we propose MLM, a reinforcement learning framework driven by both real-world and simulation data. It enables a six-DoF robotic arm-equipped quadruped robot to perform...","url_abs":"https://arxiv.org/abs/2508.10538","url_pdf":"https://arxiv.org/pdf/2508.10538v2","authors":"[\"Xin Liu\",\"Bida Ma\",\"Chenkun Qi\",\"Yan Ding\",\"Nuo Xu\",\"Zhaxizhuoma\",\"Guorong Zhang\",\"Pengan Chen\",\"Kehui Liu\",\"Zhongjie Jia\",\"Chuyue Guan\",\"Yule Mo\",\"Jiaqi Liu\",\"Feng Gao\",\"Jiangwei Zhong\",\"Bin Zhao\",\"Xuelong Li\"]","published":"2025-08-14T11:18:32Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}