{"ID":2824777,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.22489","arxiv_id":"2512.22489","title":"Tracking by Predicting 3-D Gaussians Over Time","abstract":"We propose Video Gaussian Masked Autoencoders (Video-GMAE), a self-supervised approach for representation learning that encodes a sequence of images into a set of Gaussian splats moving over time. Representing a video as a set of Gaussians enforces a reasonable inductive bias: that 2-D videos are often consistent projections of a dynamic 3-D scene. We find that tracking emerges when pretraining a network with this architecture. Mapping the trajectory of the learnt Gaussians onto the image plane gives zero-shot tracking performance comparable to state-of-the-art. With small-scale finetuning, our models achieve 34.6% improvement on Kinetics, and 13.1% on Kubric datasets, surpassing existing self-supervised video approaches. The project page and code are publicly available at https://videogmae.org/ and https://github.com/tekotan/video-gmae.","short_abstract":"We propose Video Gaussian Masked Autoencoders (Video-GMAE), a self-supervised approach for representation learning that encodes a sequence of images into a set of Gaussian splats moving over time. Representing a video as a set of Gaussians enforces a reasonable inductive bias: that 2-D videos are often consistent proje...","url_abs":"https://arxiv.org/abs/2512.22489","url_pdf":"https://arxiv.org/pdf/2512.22489v2","authors":"[\"Tanish Baranwal\",\"Himanshu Gaurav Singh\",\"Jathushan Rajasegaran\",\"Jitendra Malik\"]","published":"2025-12-27T06:16:54Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","project_urls":"[\"https://videogmae.org/\"]","has_code":false,"code_links":[{"ID":605611,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2824777,"paper_url":"https://arxiv.org/abs/2512.22489","paper_title":"Tracking by Predicting 3-D Gaussians Over Time","repo_url":"https://github.com/tekotan/video-gmae","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
