{"ID":2849440,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.22973","arxiv_id":"2510.22973","title":"Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method","abstract":"Driving scene generation is a critical domain for autonomous driving, enabling downstream applications, including perception and planning evaluation. Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities; however, their performance heavily depends on annotated occupancy data, which remains scarce. To overcome this limitation, we curate Nuplan-Occ, the largest semantic occupancy dataset to date, constructed from the widely used Nuplan benchmark. Its scale and diversity facilitate not only large-scale generative modeling but also autonomous driving downstream applications. Based on this dataset, we develop a unified framework that jointly synthesizes high-quality semantic occupancy, multi-view videos, and LiDAR point clouds. Our approach incorporates a spatio-temporal disentangled architecture to support high-fidelity spatial expansion and temporal forecasting of 4D dynamic occupancy. To bridge modal gaps, we further propose two novel techniques: a Gaussian splatting-based sparse point map rendering strategy that enhances multi-view video generation, and a sensor-aware embedding strategy that explicitly models LiDAR sensor properties for realistic multi-LiDAR simulation. Extensive experiments demonstrate that our method achieves superior generation fidelity and scalability compared to existing approaches, and validate its practical value in downstream tasks. 
Repo: https://github.com/Arlo0o/UniScene-Unified-Occupancy-centric-Driving-Scene-Generation/tree/v2","short_abstract":"Driving scene generation is a critical domain for autonomous driving, enabling downstream applications, including perception and planning evaluation. Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities; however, their performance hea...","url_abs":"https://arxiv.org/abs/2510.22973","url_pdf":"https://arxiv.org/pdf/2510.22973v2","authors":"[\"Bohan Li\",\"Xin Jin\",\"Hu Zhu\",\"Hongsi Liu\",\"Ruikai Li\",\"Jiazhe Guo\",\"Kaiwen Cai\",\"Chao Ma\",\"Yueming Jin\",\"Hao Zhao\",\"Xiaokang Yang\",\"Wenjun Zeng\"]","published":"2025-10-27T03:52:45Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":607704,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2849440,"paper_url":"https://arxiv.org/abs/2510.22973","paper_title":"Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method","repo_url":"https://github.com/Arlo0o/UniScene-Unified-Occupancy-centric-Driving-Scene-Generation","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}