{"ID":2856644,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10396","arxiv_id":"2510.10396","title":"MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations","abstract":"Humans rely on multisensory integration to perceive spatial environments, where auditory cues enable sound source localization in three-dimensional space. Despite the critical role of spatial audio in immersive technologies such as VR/AR, most existing multimodal datasets provide only monaural audio, which limits the development of spatial audio generation and understanding. To address these challenges, we introduce MRSAudio, a large-scale multimodal spatial audio dataset designed to advance research in spatial audio understanding and generation. MRSAudio spans four distinct components: MRSLife, MRSSpeech, MRSMusic, and MRSSing, covering diverse real-world scenarios. The dataset includes synchronized binaural and ambisonic audio, exocentric and egocentric video, motion trajectories, and fine-grained annotations such as transcripts, phoneme boundaries, lyrics, scores, and prompts. To demonstrate the utility and versatility of MRSAudio, we establish five foundational tasks: audio spatialization, spatial text-to-speech, spatial singing voice synthesis, spatial music generation, and sound event localization and detection. Results show that MRSAudio enables high-quality spatial modeling and supports a broad range of spatial audio research. Demos and dataset access are available at https://mrsaudio.github.io.","short_abstract":"Humans rely on multisensory integration to perceive spatial environments, where auditory cues enable sound source localization in three-dimensional space. Despite the critical role of spatial audio in immersive technologies such as VR/AR, most existing multimodal datasets provide only monaural audio, which limits the d...","url_abs":"https://arxiv.org/abs/2510.10396","url_pdf":"https://arxiv.org/pdf/2510.10396v3","authors":"[\"Wenxiang Guo\",\"Changhao Pan\",\"Zhiyuan Zhu\",\"Xintong Hu\",\"Yu Zhang\",\"Li Tang\",\"Rui Yang\",\"Han Wang\",\"Zongbao Zhang\",\"Yuhan Wang\",\"Yixuan Chen\",\"Hankun Xu\",\"Ke Xu\",\"Pengfei Fan\",\"Zhetao Chen\",\"Yanhao Yu\",\"Qiange Huang\",\"Fei Wu\",\"Zhou Zhao\"]","published":"2025-10-12T01:20:23Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[]","has_code":false}
