{"ID":2889069,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.21533","arxiv_id":"2507.21533","title":"Model Predictive Adversarial Imitation Learning for Planning from Observation","abstract":"Human demonstration data is often ambiguous and incomplete, motivating imitation learning approaches that also exhibit reliable planning behavior. A common paradigm to perform planning-from-demonstration involves learning a reward function via Inverse Reinforcement Learning (IRL) then deploying this reward via Model Predictive Control (MPC). Towards unifying these methods, we derive a replacement of the policy in IRL with a planning-based agent. With connections to Adversarial Imitation Learning, this formulation enables end-to-end interactive learning of planners from observation-only demonstrations. In addition to benefits in interpretability, complexity, and safety, we study and observe significant improvements on sample efficiency, out-of-distribution generalization, and robustness. The study includes evaluations in both simulated control benchmarks and real-world navigation experiments using few-to-single observation-only demonstrations.","short_abstract":"Human demonstration data is often ambiguous and incomplete, motivating imitation learning approaches that also exhibit reliable planning behavior. A common paradigm to perform planning-from-demonstration involves learning a reward function via Inverse Reinforcement Learning (IRL) then deploying this reward via Model Pr...","url_abs":"https://arxiv.org/abs/2507.21533","url_pdf":"https://arxiv.org/pdf/2507.21533v2","authors":"[\"Tyler Han\",\"Yanda Bao\",\"Bhaumik Mehta\",\"Gabriel Guo\",\"Anubhav Vishwakarma\",\"Emily Kang\",\"Sanghun Jung\",\"Rosario Scalise\",\"Jason Zhou\",\"Bryan Xu\",\"Byron Boots\"]","published":"2025-07-29T06:52:52Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
