{"ID":2879795,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.15429","arxiv_id":"2508.15429","title":"AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation","abstract":"AudioSet is a widely used benchmark in the audio research community and has significantly advanced various audio-related tasks. However, persistent issues with label accuracy and completeness remain critical bottlenecks that limit performance in downstream applications. To address the aforementioned challenges, we propose a three-stage reannotation framework that harnesses general-purpose audio-language foundation models to systematically improve the label quality of AudioSet. The framework employs a cross-modal prompting strategy, inspired by the concept of prompt chaining, wherein prompts are sequentially composed to execute subtasks (audio comprehension, label synthesis, and semantic alignment). Leveraging this framework, we construct a high-quality, structured relabeled version of AudioSet-R. Extensive experiments conducted on representative audio classification models--including AST, PANNs, SSAST, and AudioMAE--consistently demonstrate substantial performance improvements, thereby validating the generalizability and effectiveness of the proposed approach in enhancing label reliability. The code is publicly available at: https://github.com/colaudiolab/AudioSet-R.","short_abstract":"AudioSet is a widely used benchmark in the audio research community and has significantly advanced various audio-related tasks. However, persistent issues with label accuracy and completeness remain critical bottlenecks that limit performance in downstream applications. To address the aforementioned challenges, we propo...","url_abs":"https://arxiv.org/abs/2508.15429","url_pdf":"https://arxiv.org/pdf/2508.15429v1","authors":"[\"Yulin Sun\",\"Qisheng Xu\",\"Yi Su\",\"Qian Zhu\",\"Yong Dou\",\"Xinwang Liu\",\"Kele Xu\"]","published":"2025-08-21T10:30:01Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":610629,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2879795,"paper_url":"https://arxiv.org/abs/2508.15429","paper_title":"AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation","repo_url":"https://github.com/colaudiolab/AudioSet-R","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
