{"ID":2870665,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.13579","arxiv_id":"2509.13579","title":"TreeIRL: Safe Urban Driving with Tree Search and Inverse Reinforcement Learning","abstract":"We present TreeIRL, a novel planner for autonomous driving that combines Monte Carlo tree search (MCTS) and inverse reinforcement learning (IRL) to achieve state-of-the-art performance in simulation and in real-world driving. The core idea is to use MCTS to find a promising set of safe candidate trajectories and a deep IRL scoring function to select the most human-like among them. We evaluate TreeIRL against both classical and state-of-the-art planners in large-scale simulations and on 500+ miles of real-world autonomous driving in the Las Vegas metropolitan area. Test scenarios include dense urban traffic, adaptive cruise control, cut-ins, and traffic lights. TreeIRL achieves the best overall performance, striking a balance between safety, progress, comfort, and human-likeness. To our knowledge, our work is the first demonstration of MCTS-based planning on public roads and underscores the importance of evaluating planners across a diverse set of metrics and in real-world environments. TreeIRL is highly extensible and could be further improved with reinforcement learning and imitation learning, providing a framework for exploring different combinations of classical and learning-based approaches to solve the planning bottleneck in autonomous driving.","short_abstract":"We present TreeIRL, a novel planner for autonomous driving that combines Monte Carlo tree search (MCTS) and inverse reinforcement learning (IRL) to achieve state-of-the-art performance in simulation and in real-world driving. \nThe core idea is to use MCTS to find a promising set of safe candidate trajectories and a deep...","url_abs":"https://arxiv.org/abs/2509.13579","url_pdf":"https://arxiv.org/pdf/2509.13579v4","authors":"[\"Momchil S. Tomov\",\"Sang Uk Lee\",\"Hansford Hendrago\",\"Jinwook Huh\",\"Teawon Han\",\"Forbes Howington\",\"Rafael da Silva\",\"Gianmarco Bernasconi\",\"Marc Heim\",\"Samuel Findler\",\"Xiaonan Ji\",\"Alexander Boule\",\"Michael Napoli\",\"Kuo Chen\",\"Jesse Miller\",\"Boaz Floor\",\"Yunqing Hu\"]","published":"2025-09-16T22:37:37Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
