{"ID":2863010,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.26645","arxiv_id":"2509.26645","title":"TTT3R: 3D Reconstruction as Test-Time Training","abstract":"Modern Recurrent Neural Networks have become a competitive architecture for 3D reconstruction due to their linear-time complexity. However, their performance degrades significantly when applied beyond the training context length, revealing limited length generalization. In this work, we revisit the 3D reconstruction foundation models from a Test-Time Training perspective, framing their designs as an online learning problem. Building on this perspective, we leverage the alignment confidence between the memory state and incoming observations to derive a closed-form learning rate for memory updates that balances retaining historical information against adapting to new observations. This training-free intervention, termed TTT3R, substantially improves length generalization, achieving a $2\\times$ improvement in global pose estimation over baselines, while operating at 20 FPS with just 6 GB of GPU memory to process thousands of images. Code is available at https://rover-xingyu.github.io/TTT3R","short_abstract":"Modern Recurrent Neural Networks have become a competitive architecture for 3D reconstruction due to their linear-time complexity. However, their performance degrades significantly when applied beyond the training context length, revealing limited length generalization. In this work, we revisit the 3D reconstruction fo...","url_abs":"https://arxiv.org/abs/2509.26645","url_pdf":"https://arxiv.org/pdf/2509.26645v4","authors":"[\"Xingyu Chen\",\"Yue Chen\",\"Yuliang Xiu\",\"Andreas Geiger\",\"Anpei Chen\"]","published":"2025-09-30T17:59:51Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
