{"ID":2857407,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.09096","arxiv_id":"2510.09096","title":"When a Robot is More Capable than a Human: Learning from Constrained Demonstrators","abstract":"Learning from demonstrations enables experts to teach robots complex tasks using interfaces such as kinesthetic teaching, joystick control, and sim-to-real transfer. However, these interfaces often constrain the expert's ability to demonstrate optimal behavior due to indirect control, setup restrictions, and hardware safety. For example, a joystick can move a robotic arm only in a 2D plane, even though the robot operates in a higher-dimensional space. As a result, the demonstrations collected by constrained experts lead to suboptimal performance of the learned policies. This raises a key question: Can a robot learn a better policy than the one demonstrated by a constrained expert? We address this by allowing the agent to go beyond direct imitation of expert actions and explore shorter and more efficient trajectories. We use the demonstrations to infer a state-only reward signal that measures task progress, and self-label reward for unknown states using temporal interpolation. Our approach outperforms common imitation learning in both sample efficiency and task completion time. On a real WidowX robotic arm, it completes the task in 12 seconds, 10x faster than behavioral cloning, as shown in real-robot videos on https://sites.google.com/view/constrainedexpert .","short_abstract":"Learning from demonstrations enables experts to teach robots complex tasks using interfaces such as kinesthetic teaching, joystick control, and sim-to-real transfer. However, these interfaces often constrain the expert's ability to demonstrate optimal behavior due to indirect control, setup restrictions, and hardware s...","url_abs":"https://arxiv.org/abs/2510.09096","url_pdf":"https://arxiv.org/pdf/2510.09096v3","authors":"[\"Xinhu Li\",\"Ayush Jain\",\"Zhaojing Yang\",\"Yigit Korkmaz\",\"Erdem Bıyık\"]","published":"2025-10-10T07:48:12Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.AI\",\"cs.LG\"]","methods":"[]","project_urls":"[\"https://sites.google.com/view/constrainedexpert\"]","has_code":false,"code_links":[{"ID":608448,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2857407,"paper_url":"https://arxiv.org/abs/2510.09096","paper_title":"When a Robot is More Capable than a Human: Learning from Constrained Demonstrators","repo_url":"https://github.com/google/safevalues","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
