{"ID":2898139,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.04049","arxiv_id":"2507.04049","title":"DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving","abstract":"Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learning with diffusion-based generation to produce diverse and feasible trajectories. At the core of DIVER lies a reinforced diffusion-based generation mechanism. First, the model conditions on map elements and surrounding agents to generate multiple reference trajectories from a single ground-truth trajectory, alleviating the limitations of imitation learning that arise from relying solely on single expert demonstrations. Second, reinforcement learning is employed to guide the diffusion process, where reward-based supervision enforces safety and diversity constraints on the generated trajectories, thereby enhancing their practicality and generalization capability. Furthermore, to address the limitations of L2-based open-loop metrics in capturing trajectory diversity, we propose a novel Diversity metric to evaluate the diversity of multi-mode predictions. Extensive experiments on the closed-loop NAVSIM and Bench2Drive benchmarks, as well as the open-loop nuScenes dataset, demonstrate that DIVER significantly improves trajectory diversity, effectively addressing the mode collapse problem inherent in imitation learning.","short_abstract":"Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. 
In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learni...","url_abs":"https://arxiv.org/abs/2507.04049","url_pdf":"https://arxiv.org/pdf/2507.04049v4","authors":"[\"Ziying Song\",\"Lin Liu\",\"Hongyu Pan\",\"Bencheng Liao\",\"Mingzhe Guo\",\"Lei Yang\",\"Yongchang Zhang\",\"Shaoqing Xu\",\"Caiyan Jia\",\"Yadan Luo\"]","published":"2025-07-05T14:19:19Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.RO\"]","methods":"[\"Reinforcement Learning\",\"Diffusion Model\"]","has_code":false}