{"ID":2833807,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.02400","arxiv_id":"2512.02400","title":"Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation","abstract":"Object-goal navigation in open-vocabulary settings requires agents to locate novel objects in unseen environments, yet existing approaches suffer from opaque decision-making processes and low success rate on locating unseen objects. To address these challenges, we propose Nav-$R^2$, a framework that explicitly models two critical types of relationships, target-environment modeling and environment-action planning, through structured Chain-of-Thought (CoT) reasoning coupled with a Similarity-Aware Memory. We construct a Nav$R^2$-CoT dataset that teaches the model to perceive the environment, focus on target-related objects in the surrounding context and finally make future action plans. Our SA-Mem preserves the most target-relevant and current observation-relevant features from both temporal and semantic perspectives by compressing video frames and fusing historical observations, while introducing no additional parameters. Compared to previous methods, Nav-R^2 achieves state-of-the-art performance in localizing unseen objects through a streamlined and efficient pipeline, avoiding overfitting to seen object categories while maintaining real-time inference at 2Hz. Resources will be made publicly available at \\href{https://github.com/AMAP-EAI/Nav-R2}{github link}.","short_abstract":"Object-goal navigation in open-vocabulary settings requires agents to locate novel objects in unseen environments, yet existing approaches suffer from opaque decision-making processes and low success rate on locating unseen objects. To address these challenges, we propose Nav-$R^2$, a framework that explicitly models t...","url_abs":"https://arxiv.org/abs/2512.02400","url_pdf":"https://arxiv.org/pdf/2512.02400v1","authors":"[\"Wentao Xiang\",\"Haokang Zhang\",\"Tianhang Yang\",\"Zedong Chu\",\"Ruihang Chu\",\"Shichao Xie\",\"Yujian Yuan\",\"Jian Sun\",\"Zhining Gu\",\"Junjie Wang\",\"Xiaolong Wu\",\"Mu Xu\",\"Yujiu Yang\"]","published":"2025-12-02T04:21:02Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":606355,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2833807,"paper_url":"https://arxiv.org/abs/2512.02400","paper_title":"Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation","repo_url":"https://github.com/AMAP-EAI/Nav-R2","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
