{"ID":2851251,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.20668","arxiv_id":"2510.20668","title":"From Masks to Worlds: A Hitchhiker's Guide to World Models","abstract":"This is not a typical survey of world models; it is a guide for those who want to build worlds. We do not aim to catalog every paper that has ever mentioned a \"world model\". Instead, we follow one clear road: from early masked models that unified representation learning across modalities, to unified architectures that share a single paradigm, then to interactive generative models that close the action-perception loop, and finally to memory-augmented systems that sustain consistent worlds over time. We bypass loosely related branches to focus on the core: the generative heart, the interactive loop, and the memory system. We show that this is the most promising path towards true world models.","short_abstract":"This is not a typical survey of world models; it is a guide for those who want to build worlds. We do not aim to catalog every paper that has ever mentioned a \"world model\". Instead, we follow one clear road: from early masked models that unified representation learning across modalities, to unified architectures that...","url_abs":"https://arxiv.org/abs/2510.20668","url_pdf":"https://arxiv.org/pdf/2510.20668v1","authors":"[\"Jinbin Bai\",\"Yu Lei\",\"Hecong Wu\",\"Yuchen Zhu\",\"Shufan Li\",\"Yi Xin\",\"Xiangtai Li\",\"Molei Tao\",\"Aditya Grover\",\"Ming-Hsuan Yang\"]","published":"2025-10-23T15:46:44Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}