{"ID":2846552,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.01315","arxiv_id":"2511.01315","title":"MVSMamba: Multi-View Stereo with State Space Model","abstract":"Robust feature representations are essential for learning-based Multi-View Stereo (MVS), which relies on accurate feature matching. Recent MVS methods leverage Transformers to capture long-range dependencies based on local features extracted by conventional feature pyramid networks. However, the quadratic complexity of Transformer-based MVS methods poses challenges to balance performance and efficiency. Motivated by the global modeling capability and linear complexity of the Mamba architecture, we propose MVSMamba, the first Mamba-based MVS network. MVSMamba enables efficient global feature aggregation with minimal computational overhead. To fully exploit Mamba's potential in MVS, we propose a Dynamic Mamba module (DM-module) based on a novel reference-centered dynamic scanning strategy, which enables: (1) Efficient intra- and inter-view feature interaction from the reference to source views, (2) Omnidirectional multi-view feature representations, and (3) Multi-scale global feature aggregation. Extensive experimental results demonstrate MVSMamba outperforms state-of-the-art MVS methods on the DTU dataset and the Tanks-and-Temples benchmark with both superior performance and efficiency. The source code is available at https://github.com/JianfeiJ/MVSMamba.","short_abstract":"Robust feature representations are essential for learning-based Multi-View Stereo (MVS), which relies on accurate feature matching. Recent MVS methods leverage Transformers to capture long-range dependencies based on local features extracted by conventional feature pyramid networks. However, the quadratic complexity of...","url_abs":"https://arxiv.org/abs/2511.01315","url_pdf":"https://arxiv.org/pdf/2511.01315v1","authors":"[\"Jianfei Jiang\",\"Qiankun Liu\",\"Hongyuan Liu\",\"Haochen Yu\",\"Liyong Wang\",\"Jiansheng Chen\",\"Huimin Ma\"]","published":"2025-11-03T07:59:07Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Transformer\"]","has_code":false,"code_links":[{"ID":607442,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2846552,"paper_url":"https://arxiv.org/abs/2511.01315","paper_title":"MVSMamba: Multi-View Stereo with State Space Model","repo_url":"https://github.com/JianfeiJ/MVSMamba","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}