{"ID":3049923,"CreatedAt":"2026-06-04T02:13:16.786527022Z","UpdatedAt":"2026-06-06T15:44:26.945507316Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05102","arxiv_id":"2606.05102","title":"ZipSplat: Fewer Gaussians, Better Splats","abstract":"Feed-forward 3D Gaussian Splatting methods reconstruct a scene from posed or pose-free images in a single forward pass, yet current approaches predict one Gaussian per input pixel, tying the representation budget to camera resolution rather than scene complexity. A flat wall and a richly textured object thus produce equally many Gaussians despite very different geometric needs. We propose ZipSplat, a token-based feed-forward model that decouples Gaussian placement from the pixel grid. A multi-view backbone extracts dense visual tokens, and k-means clustering compresses them into a compact set of scene tokens. Cross- and self-attention refine these tokens, and a lightweight MLP decodes each into a group of Gaussians with unconstrained 3D positions. Because clustering is applied at inference, a single trained model spans the quality-efficiency curve without retraining. ZipSplat operates without ground-truth poses or intrinsics, yet sets a new state of the art on DL3DV and RealEstate10K with ${\\sim}6{\\times}$ fewer Gaussians than pixel-aligned methods, surpassing the best pose-free baseline by 2.1dB and 1.2dB PSNR, respectively. It further generalizes zero-shot to Mip-NeRF360 and ScanNet++, outperforming all comparable baselines. Our project page is at ${\\href{https://veichta.com/zipsplat}{https://veichta.com/zipsplat}}$.","short_abstract":"Feed-forward 3D Gaussian Splatting methods reconstruct a scene from posed or pose-free images in a single forward pass, yet current approaches predict one Gaussian per input pixel, tying the representation budget to camera resolution rather than scene complexity. A flat wall and a richly textured object thus produce eq...","url_abs":"https://arxiv.org/abs/2606.05102","url_pdf":"https://arxiv.org/pdf/2606.05102v1","authors":"[\"Alexander Veicht\",\"Sunghwan Hong\",\"Dániel Baráth\",\"Marc Pollefeys\"]","published":"2026-06-03T17:04:30Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","project_urls":"[\"https://veichta.com/zipsplat\"]","has_code":false,"code_links":[{"ID":612763,"CreatedAt":"2026-06-04T02:13:16.786527022Z","UpdatedAt":"2026-06-04T02:13:16.786527022Z","DeletedAt":null,"paper_id":3049923,"paper_url":"https://arxiv.org/abs/2606.05102","paper_title":"ZipSplat: Fewer Gaussians, Better Splats","repo_url":"https://github.com/cvg/ZipSplat","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
