{"ID":2848074,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.26536","arxiv_id":"2510.26536","title":"RoboOS-NeXT: A Unified Memory-based Framework for Lifelong, Scalable, and Robust Multi-Robot Collaboration","abstract":"The proliferation of collaborative robots across diverse tasks and embodiments presents a central challenge: achieving lifelong adaptability, scalable coordination, and robust scheduling in multi-agent systems. Existing approaches, from vision-language-action (VLA) models to hierarchical frameworks, fall short due to their reliance on limited or individual-agent memory. This fundamentally constrains their ability to learn over long horizons, scale to heterogeneous teams, or recover from failures, highlighting the need for a unified memory representation. To address these limitations, we introduce RoboOS-NeXT, a unified memory-based framework for lifelong, scalable, and robust multi-robot collaboration. At the core of RoboOS-NeXT is the novel Spatio-Temporal-Embodiment Memory (STEM), which integrates spatial scene geometry, temporal event history, and embodiment profiles into a shared representation. This memory-centric design is integrated into a brain-cerebellum framework, where a high-level brain model performs global planning by retrieving and updating STEM, while low-level controllers execute actions locally. This closed loop between cognition, memory, and execution enables dynamic task allocation, fault-tolerant collaboration, and consistent state synchronization. We conduct extensive experiments spanning complex coordination tasks in restaurants, supermarkets, and households. Our results demonstrate that RoboOS-NeXT achieves superior performance across heterogeneous embodiments, validating its effectiveness in enabling lifelong, scalable, and robust multi-robot collaboration. 
Project website: https://flagopen.github.io/RoboOS/","short_abstract":"The proliferation of collaborative robots across diverse tasks and embodiments presents a central challenge: achieving lifelong adaptability, scalable coordination, and robust scheduling in multi-agent systems. Existing approaches, from vision-language-action (VLA) models to hierarchical frameworks, fall short due to t...","url_abs":"https://arxiv.org/abs/2510.26536","url_pdf":"https://arxiv.org/pdf/2510.26536v1","authors":"[\"Huajie Tan\",\"Cheng Chi\",\"Xiansheng Chen\",\"Yuheng Ji\",\"Zhongxia Zhao\",\"Xiaoshuai Hao\",\"Yaoxu Lyu\",\"Mingyu Cao\",\"Junkai Zhao\",\"Huaihai Lyu\",\"Enshen Zhou\",\"Ning Chen\",\"Yankai Fu\",\"Cheng Peng\",\"Wei Guo\",\"Dong Liang\",\"Zhuo Chen\",\"Mengsi Lyu\",\"Chenrui He\",\"Yulong Ao\",\"Yonghua Lin\",\"Pengwei Wang\",\"Zhongyuan Wang\",\"Shanghang Zhang\"]","published":"2025-10-30T14:26:40Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[]","has_code":false}
