{"ID":2835974,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.22429","arxiv_id":"2511.22429","title":"Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation","abstract":"We present Fin3R, a simple, effective, and general fine-tuning method for feed-forward 3D reconstruction models. The family of feed-forward reconstruction models regresses pointmaps of all input images to a reference frame coordinate system, along with other auxiliary outputs, in a single forward pass. However, we find that current models struggle with fine geometry and robustness due to (\\textit{i}) the scarcity of high-fidelity depth and pose supervision and (\\textit{ii}) the inherent geometric misalignment from multi-view pointmap regression. Fin3R jointly tackles these two issues with an extra lightweight fine-tuning step. We freeze the decoder, which handles view matching, and fine-tune only the image encoder, the component dedicated to feature extraction. The encoder is enriched with fine geometric details distilled from a strong monocular teacher model on large, unlabeled datasets, using a custom, lightweight LoRA adapter. We validate our method on a wide range of models, including DUSt3R, MASt3R, CUT3R, and VGGT. The fine-tuned models consistently deliver sharper boundaries, recover complex structures, and achieve higher geometric accuracy in both single- and multi-view settings, while adding only the tiny LoRA weights, which leave test-time memory and latency virtually unchanged. Project page: \\href{http://visual-ai.github.io/fin3r}{https://visual-ai.github.io/fin3r}","short_abstract":"We present Fin3R, a simple, effective, and general fine-tuning method for feed-forward 3D reconstruction models. 
The family of feed-forward reconstruction models regresses pointmaps of all input images to a reference frame coordinate system, along with other auxiliary outputs, in a single forward pass. However, we find t...","url_abs":"https://arxiv.org/abs/2511.22429","url_pdf":"https://arxiv.org/pdf/2511.22429v1","authors":"[\"Weining Ren\",\"Hongjun Wang\",\"Xiao Tan\",\"Kai Han\"]","published":"2025-11-27T13:10:19Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"LoRA\"]","has_code":false}
