{"ID":2834610,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.01882","arxiv_id":"2512.01882","title":"New Spiking Architecture for Multi-Modal Decision-Making in Autonomous Vehicles","abstract":"This work proposes an end-to-end multi-modal reinforcement learning framework for high-level decision-making in autonomous vehicles. The framework integrates heterogeneous sensory input, including camera images, LiDAR point clouds, and vehicle heading information, through a cross-attention transformer-based perception module. Although transformers have become the backbone of modern multi-modal architectures, their high computational cost limits their deployment in resource-constrained edge environments. To overcome this challenge, we propose a spiking temporal-aware transformer-like architecture that uses ternary spiking neurons for computationally efficient multi-modal fusion. Comprehensive evaluations across multiple tasks in the Highway Environment demonstrate the effectiveness and efficiency of the proposed approach for real-time autonomous decision-making.","short_abstract":"This work proposes an end-to-end multi-modal reinforcement learning framework for high-level decision-making in autonomous vehicles. The framework integrates heterogeneous sensory input, including camera images, LiDAR point clouds, and vehicle heading information, through a cross-attention transformer-based perception...","url_abs":"https://arxiv.org/abs/2512.01882","url_pdf":"https://arxiv.org/pdf/2512.01882v1","authors":"[\"Aref Ghoreishee\",\"Abhishek Mishra\",\"Lifeng Zhou\",\"John Walsh\",\"Nagarajan Kandasamy\"]","published":"2025-12-01T17:04:56Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\",\"Transformer\"]","has_code":false}
