{"ID":2877054,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.21102","arxiv_id":"2508.21102","title":"GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions","abstract":"We focus on the task of identifying the location of target regions from a natural language instruction and a front camera image captured by a mobility. This task is challenging because it requires both existence prediction and segmentation, particularly for stuff-type target regions with ambiguous boundaries. Existing methods often underperform in handling stuff-type target regions, in addition to absent or multiple targets. To overcome these limitations, we propose GENNAV, which predicts target existence and generates segmentation masks for multiple stuff-type target regions. To evaluate GENNAV, we constructed a novel benchmark called GRiN-Drive, which includes three distinct types of samples: no-target, single-target, and multi-target. GENNAV achieved superior performance over baseline methods on standard evaluation metrics. Furthermore, we conducted real-world experiments with four automobiles operated in five geographically distinct urban areas to validate its zero-shot transfer performance. In these experiments, GENNAV outperformed baseline methods and demonstrated its robustness across diverse real-world environments. The project page is available at https://gennav.vercel.app/.","short_abstract":"We focus on the task of identifying the location of target regions from a natural language instruction and a front camera image captured by a mobility. This task is challenging because it requires both existence prediction and segmentation, particularly for stuff-type target regions with ambiguous boundaries. \nExisting...","url_abs":"https://arxiv.org/abs/2508.21102","url_pdf":"https://arxiv.org/pdf/2508.21102v1","authors":"[\"Kei Katsumata\",\"Yui Iioka\",\"Naoki Hosomi\",\"Teruhisa Misu\",\"Kentaro Yamada\",\"Komei Sugiura\"]","published":"2025-08-28T08:09:38Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.RO\"]","methods":"[]","project_urls":"[\"https://gennav.vercel.app/\"]","has_code":false,"code_links":[{"ID":610333,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2877054,"paper_url":"https://arxiv.org/abs/2508.21102","paper_title":"GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions","repo_url":"https://github.com/nerfies/nerfies.github.io","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
