{"ID":2834994,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.01061","arxiv_id":"2512.01061","title":"Opening the Sim-to-Real Door for Humanoid Pixel-to-Action Policy Transfer","abstract":"Recent progress in GPU-accelerated, photorealistic simulation has opened a scalable data-generation path for robot learning, where massive physics and visual randomization allow policies to generalize beyond curated environments. Building on these advances, we develop a teacher-student-bootstrap learning framework for vision-based humanoid loco-manipulation, using articulated-object interaction as a representative high-difficulty benchmark. Our approach introduces a staged-reset exploration strategy that stabilizes long-horizon privileged-policy training, and a GRPO-based fine-tuning procedure that mitigates partial observability and improves closed-loop consistency in sim-to-real RL. Trained entirely on simulation data, the resulting policy achieves robust zero-shot performance across diverse door types and outperforms human teleoperators by up to 31.7% in task completion time under the same whole-body control stack. This represents the first humanoid sim-to-real policy capable of diverse articulated loco-manipulation using pure RGB perception.","short_abstract":"Recent progress in GPU-accelerated, photorealistic simulation has opened a scalable data-generation path for robot learning, where massive physics and visual randomization allow policies to generalize beyond curated environments. \nBuilding on these advances, we develop a teacher-student-bootstrap learning framework for...","url_abs":"https://arxiv.org/abs/2512.01061","url_pdf":"https://arxiv.org/pdf/2512.01061v1","authors":"[\"Haoru Xue\",\"Tairan He\",\"Zi Wang\",\"Qingwei Ben\",\"Wenli Xiao\",\"Zhengyi Luo\",\"Xingye Da\",\"Fernando Castañeda\",\"Guanya Shi\",\"Shankar Sastry\",\"Linxi \\\"Jim\\\" Fan\",\"Yuke Zhu\"]","published":"2025-11-30T20:07:13Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.CV\"]","methods":"[\"LoRA\"]","has_code":false}
