{"ID":2865820,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.20843","arxiv_id":"2509.20843","title":"MTRDrive: Memory-Tool Synergistic Reasoning for Robust Autonomous Driving in Corner Cases","abstract":"Vision-Language Models (VLMs) have demonstrated significant potential for end-to-end autonomous driving, yet a substantial gap remains between their current capabilities and the reliability necessary for real-world deployment. A critical challenge is their fragility, characterized by hallucinations and poor generalization in out-of-distribution (OOD) scenarios. To bridge this gap, we introduce MTRDrive, a novel framework that integrates procedural driving experiences with a dynamic toolkit to enhance generalization and proactive decision-making. MTRDrive addresses these limitations through a closed-loop system that combines a memory-based experience retrieval mechanism with dynamic toolkits. This synergy enables the model to interact more effectively with its environment, improving both reasoning and decision-making capabilities with the help of our memory-tool synergistic reasoning. Additionally, we introduce a new benchmark based on complex Roadwork construction scenarios to rigorously evaluate zero-shot generalization. Extensive experiments demonstrate the superior effectiveness of our approach. On the public NAVSIM benchmark, our 3B-parameter MTRDrive model achieves an exceptional PDMS of 88.3 without chain-of-thought and sets a state-of-the-art performance bar on high-level planning, with a driving metric score of 79.8% and a planning accuracy of 82.6%. Rigorous zero-shot evaluation on the new Roadwork-VLM benchmark shows a strong ability to reason robustly in unseen scenarios, achieving a driving metric score of 80.2%. These results highlight MTRDrive's potential to advance autonomous driving toward safer and more reliable systems.","short_abstract":"Vision-Language Models (VLMs) have demonstrated significant potential for end-to-end autonomous driving, yet a substantial gap remains between their current capabilities and the reliability necessary for real-world deployment. A critical challenge is their fragility, characterized by hallucinations and poor generalizati...","url_abs":"https://arxiv.org/abs/2509.20843","url_pdf":"https://arxiv.org/pdf/2509.20843v1","authors":"[\"Ziang Luo\",\"Kangan Qian\",\"Jiahua Wang\",\"Yuechen Luo\",\"Jinyu Miao\",\"Zheng Fu\",\"Yunlong Wang\",\"Sicong Jiang\",\"Zilin Huang\",\"Yifei Hu\",\"Yuhao Yang\",\"Hao Ye\",\"Mengmeng Yang\",\"Xiaojian Dong\",\"Kun Jiang\",\"Diange Yang\"]","published":"2025-09-25T07:31:27Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Language Model\"]","has_code":false}