{"ID":2860333,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.04236","arxiv_id":"2510.04236","title":"Scaling Sequence-to-Sequence Generative Neural Rendering","abstract":"We present Kaleido, a family of generative models designed for photorealistic, unified object- and scene-level neural rendering. Kaleido operates on the principle that 3D can be regarded as a specialised sub-domain of video, expressed purely as a sequence-to-sequence image synthesis task. Through a systemic study of scaling sequence-to-sequence generative neural rendering, we introduce key architectural innovations that enable our model to: i) perform generative view synthesis without explicit 3D representations; ii) generate any number of 6-DoF target views conditioned on any number of reference views via a masked autoregressive framework; and iii) seamlessly unify 3D and video modelling within a single decoder-only rectified flow transformer. Within this unified framework, Kaleido leverages large-scale video data for pre-training, which significantly improves spatial consistency and reduces reliance on scarce, camera-labelled 3D datasets -- all without any architectural modifications. Kaleido sets a new state-of-the-art on a range of view synthesis benchmarks. Its zero-shot performance substantially outperforms other generative methods in few-view settings, and, for the first time, matches the quality of per-scene optimisation methods in many-view settings.","short_abstract":"We present Kaleido, a family of generative models designed for photorealistic, unified object- and scene-level neural rendering. Kaleido operates on the principle that 3D can be regarded as a specialised sub-domain of video, expressed purely as a sequence-to-sequence image synthesis task. \nThrough a systemic study of sc...","url_abs":"https://arxiv.org/abs/2510.04236","url_pdf":"https://arxiv.org/pdf/2510.04236v3","authors":"[\"Shikun Liu\",\"Kam Woh Ng\",\"Wonbong Jang\",\"Jiadong Guo\",\"Junlin Han\",\"Haozhe Liu\",\"Yiannis Douratsos\",\"Juan C. Pérez\",\"Zijian Zhou\",\"Chi Phung\",\"Tao Xiang\",\"Juan-Manuel Pérez-Rúa\"]","published":"2025-10-05T15:03:31Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Transformer\"]","has_code":false}
