{"ID":2829545,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.12302","arxiv_id":"2512.12302","title":"From Human Intention to Action Prediction: Intention-Driven End-to-End Autonomous Driving","abstract":"While end-to-end autonomous driving has achieved remarkable progress in geometric control, current systems remain constrained by a command-following paradigm that relies on simple navigational instructions. Transitioning to genuinely intelligent agents requires the capability to interpret and fulfill high-level, abstract human intentions. However, this advancement is hindered by the lack of dedicated benchmarks and semantic-aware evaluation metrics. In this paper, we formally define the task of Intention-Driven End-to-End Autonomous Driving and present Intention-Drive, a comprehensive benchmark designed to bridge this gap. We construct a large-scale dataset featuring complex natural language intentions paired with high-fidelity sensor data. To overcome the limitations of conventional trajectory-based metrics, we introduce the Imagined Future Alignment (IFA), a novel evaluation protocol leveraging generative world models to assess the semantic fulfillment of human goals beyond mere geometric accuracy. Furthermore, we explore the solution space by proposing two distinct paradigms: an end-to-end vision-language planner and a hierarchical agent-based framework. The experiments reveal a critical dichotomy where existing models exhibit satisfactory driving stability but struggle significantly with intention fulfillment. Notably, the proposed frameworks demonstrate superior alignment with human intentions.","short_abstract":"While end-to-end autonomous driving has achieved remarkable progress in geometric control, current systems remain constrained by a command-following paradigm that relies on simple navigational instructions. Transitioning to genuinely intelligent agents requires the capability to interpret and fulfill high-level, abstra...","url_abs":"https://arxiv.org/abs/2512.12302","url_pdf":"https://arxiv.org/pdf/2512.12302v2","authors":"[\"Huan Zheng\",\"Yucheng Zhou\",\"Tianyi Yan\",\"Jiayi Su\",\"Hongjun Chen\",\"Dubing Chen\",\"Xingtai Gui\",\"Wencheng Han\",\"Runzhou Tao\",\"Zhongying Qiu\",\"Jianfei Yang\",\"Jianbing Shen\"]","published":"2025-12-13T11:59:51Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.CL\",\"cs.RO\"]","methods":"[\"Large Language Model\"]","has_code":false}