{"ID":2883808,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.08086","arxiv_id":"2508.08086","title":"Matrix-3D: Omnidirectional Explorable 3D World Generation","abstract":"Explorable 3D world generation from a single image or text prompt forms a cornerstone of spatial intelligence. Recent works utilize video models to achieve wide-scope and generalizable 3D world generation. However, existing approaches often suffer from a limited scope in the generated scenes. In this work, we propose Matrix-3D, a framework that utilizes a panoramic representation for wide-coverage omnidirectional explorable 3D world generation, combining conditional video generation and panoramic 3D reconstruction. We first train a trajectory-guided panoramic video diffusion model that uses scene mesh renders as its condition to enable high-quality and geometrically consistent scene video generation. To lift the panoramic scene video to a 3D world, we propose two separate methods: (1) a feed-forward large panorama reconstruction model for rapid 3D scene reconstruction and (2) an optimization-based pipeline for accurate and detailed 3D scene reconstruction. To facilitate effective training, we also introduce the Matrix-Pano dataset, the first large-scale synthetic collection comprising 116K high-quality static panoramic video sequences with depth and trajectory annotations. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art performance in panoramic video generation and 3D world generation. See more at https://matrix-3d.github.io.","short_abstract":"Explorable 3D world generation from a single image or text prompt forms a cornerstone of spatial intelligence. Recent works utilize video models to achieve wide-scope and generalizable 3D world generation. However, existing approaches often suffer from a limited scope in the generated scenes. \nIn this work, we propose Ma...","url_abs":"https://arxiv.org/abs/2508.08086","url_pdf":"https://arxiv.org/pdf/2508.08086v1","authors":"[\"Zhongqi Yang\",\"Wenhang Ge\",\"Yuqi Li\",\"Jiaqi Chen\",\"Haoyuan Li\",\"Mengyin An\",\"Fei Kang\",\"Hua Xue\",\"Baixin Xu\",\"Yuyang Yin\",\"Eric Li\",\"Yang Liu\",\"Yikai Wang\",\"Hao-Xiang Guo\",\"Yahui Zhou\"]","published":"2025-08-11T15:29:57Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.GR\"]","methods":"[\"Diffusion Model\",\"LoRA\"]","has_code":false}
