{"ID":2828420,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.14225","arxiv_id":"2512.14225","title":"OmniGen: Unified Multimodal Sensor Generation for Autonomous Driving","abstract":"Autonomous driving has seen remarkable advancements, largely driven by extensive real-world data collection. However, acquiring diverse and corner-case data remains costly and inefficient. Generative models have emerged as a promising solution by synthesizing realistic sensor data. However, existing approaches primarily focus on single-modality generation, leading to inefficiencies and misalignment in multimodal sensor data. To address these challenges, we propose OmniGen, which generates aligned multimodal sensor data in a unified framework. Our approach leverages a shared Bird\u2019s Eye View (BEV) space to unify multimodal features and designs a novel generalizable multimodal reconstruction method, UAE, to jointly decode LiDAR and multi-view camera data. UAE achieves multimodal sensor decoding through volume rendering, enabling accurate and flexible reconstruction. Furthermore, we incorporate a Diffusion Transformer (DiT) with a ControlNet branch to enable controllable multimodal sensor generation. Our comprehensive experiments demonstrate that OmniGen achieves the desired performance in unified multimodal sensor data generation, with multimodal consistency and flexible sensor adjustments.","short_abstract":"Autonomous driving has seen remarkable advancements, largely driven by extensive real-world data collection. However, acquiring diverse and corner-case data remains costly and inefficient. Generative models have emerged as a promising solution by synthesizing realistic sensor data. However, existing approaches primaril...","url_abs":"https://arxiv.org/abs/2512.14225","url_pdf":"https://arxiv.org/pdf/2512.14225v1","authors":"[\"Tao Tang\",\"Enhui Ma\",\"xia zhou\",\"Letian Wang\",\"Tianyi Yan\",\"Xueyang Zhang\",\"Kun Zhan\",\"Peng Jia\",\"XianPeng Lang\",\"Jia-Wang Bian\",\"Kaicheng Yu\",\"Xiaodan Liang\"]","published":"2025-12-16T09:18:15Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\",\"Transformer\"]","has_code":false}
