{"ID":2836914,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.20216","arxiv_id":"2511.20216","title":"CostNav: A Navigation Benchmark for Real-World Economic-Cost Evaluation of Physical AI Agents","abstract":"While current navigation benchmarks prioritize task success in simplified settings, they neglect the multidimensional economic constraints essential for the real-world commercialization of autonomous delivery systems. We introduce CostNav, an Economic Navigation Benchmark that evaluates physical AI agents through comprehensive economic cost-revenue analysis aligned with real-world business operations. By integrating industry-standard data--such as Securities and Exchange Commission (SEC) filings and Abbreviated Injury Scale (AIS) injury reports--with Isaac Sim's detailed collision and cargo dynamics, CostNav transcends simple task completion to accurately evaluate business value in complex, real-world scenarios. To our knowledge, CostNav is the first physics-grounded economic benchmark that uses industry-standard regulatory and financial data to quantitatively expose the gap between navigation research metrics and commercial viability, revealing that optimizing for task success on a simplified task fundamentally differs from optimizing for real-world economic deployment. Evaluating seven baselines--two rule-based and five imitation learning--we find that no current method is economically viable, all yielding negative contribution margins. The best-performing method, CANVAS (-27.36\\$/run), equipped with only an RGB camera and GPS, outperforms LiDAR-equipped Nav2 w/ GPS (-35.46\\$/run). We challenge the community to develop navigation policies that achieve economic viability on CostNav. We remain method-agnostic, evaluating success solely on cost rather than the underlying architecture. All resources are available at https://github.com/worv-ai/CostNav.","short_abstract":"While current navigation benchmarks prioritize task success in simplified settings, they neglect the multidimensional economic constraints essential for the real-world commercialization of autonomous delivery systems. We introduce CostNav, an Economic Navigation Benchmark that evaluates physical AI agents through compr...","url_abs":"https://arxiv.org/abs/2511.20216","url_pdf":"https://arxiv.org/pdf/2511.20216v5","authors":"[\"Haebin Seong\",\"Sungmin Kim\",\"Yongjun Cho\",\"Myunchul Joe\",\"Geunwoo Kim\",\"Yubeen Park\",\"Sunhoo Kim\",\"Yoonshik Kim\",\"Suhwan Choi\",\"Jaeyoon Jung\",\"Jiyong Youn\",\"Jinmyung Kwak\",\"Sunghee Ahn\",\"Jaemin Lee\",\"Younggil Do\",\"Seungyeop Yi\",\"Woojin Cheong\",\"Minhyeok Oh\",\"Minchan Kim\",\"Seongjae Kang\",\"Samwoo Seong\",\"Youngjae Yu\",\"Yunsung Lee\"]","published":"2025-11-25T11:42:28Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CE\",\"cs.CV\",\"cs.LG\",\"cs.RO\"]","methods":"[]","has_code":false,"code_links":[{"ID":606639,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2836914,"paper_url":"https://arxiv.org/abs/2511.20216","paper_title":"CostNav: A Navigation Benchmark for Real-World Economic-Cost Evaluation of Physical AI Agents","repo_url":"https://github.com/worv-ai/CostNav","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
