{"ID":2843130,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.07819","arxiv_id":"2511.07819","title":"Human Motion Synthesis in 3D Scenes via Unified Scene Semantic Occupancy","abstract":"Human motion synthesis in 3D scenes relies heavily on scene comprehension, but current methods focus mainly on scene structure and ignore semantic understanding. In this paper, we propose a human motion synthesis framework that takes a unified Scene Semantic Occupancy (SSO) for scene representation, termed SSOMotion. We design a bi-directional tri-plane decomposition to derive a compact version of the SSO, and scene semantics are mapped to a unified feature space via CLIP encoding and shared linear dimensionality reduction. Such a strategy can derive fine-grained scene semantic structures while significantly reducing redundant computation. We further use these scene hints and the movement direction derived from instructions for motion control via frame-wise scene query. Extensive experiments and ablation studies conducted on cluttered scenes using ShapeNet furniture, as well as scanned scenes from PROX and Replica datasets, demonstrate its cutting-edge performance while validating its effectiveness and generalization ability. Code will be publicly available at https://github.com/jingyugong/SSOMotion.","short_abstract":"Human motion synthesis in 3D scenes relies heavily on scene comprehension, but current methods focus mainly on scene structure and ignore semantic understanding. In this paper, we propose a human motion synthesis framework that takes a unified Scene Semantic Occupancy (SSO) for scene representation, termed SSOMot...","url_abs":"https://arxiv.org/abs/2511.07819","url_pdf":"https://arxiv.org/pdf/2511.07819v1","authors":"[\"Gong Jingyu\",\"Tong Kunkun\",\"Chen Zhuoran\",\"Yuan Chuanhan\",\"Chen Mingang\",\"Zhang Zhizhong\",\"Tan Xin\",\"Xie Yuan\"]","published":"2025-11-11T04:33:16Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":607185,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2843130,"paper_url":"https://arxiv.org/abs/2511.07819","paper_title":"Human Motion Synthesis in 3D Scenes via Unified Scene Semantic Occupancy","repo_url":"https://github.com/jingyugong/SSOMotion","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
