{"ID":2877283,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.20983","arxiv_id":"2508.20983","title":"Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System","abstract":"The SAFE Challenge evaluates synthetic speech detection across three tasks: unmodified audio, processed audio with compression artifacts, and laundered audio designed to evade detection. We systematically explore self-supervised learning (SSL) front-ends, training data compositions, and audio length configurations for robust deepfake detection. Our AASIST-based approach incorporates a WavLM Large front-end with RawBoost augmentation, trained on a multilingual dataset of 256,600 samples spanning 9 languages and over 70 TTS systems from CodecFake, MLAAD v5, SpoofCeleb, Famous Figures, and MAILABS. Through extensive experimentation with different SSL front-ends, three training data versions, and two audio lengths, we achieved second place in both Task 1 (unmodified audio detection) and Task 3 (laundered audio detection), demonstrating strong generalization and robustness.","short_abstract":"The SAFE Challenge evaluates synthetic speech detection across three tasks: unmodified audio, processed audio with compression artifacts, and laundered audio designed to evade detection. We systematically explore self-supervised learning (SSL) front-ends, training data compositions, and audio length configurations for...","url_abs":"https://arxiv.org/abs/2508.20983","url_pdf":"https://arxiv.org/pdf/2508.20983v2","authors":"[\"Hashim Ali\",\"Surya Subramani\",\"Lekha Bollinani\",\"Nithin Sai Adupa\",\"Sali El-Loh\",\"Hafiz Malik\"]","published":"2025-08-28T16:37:50Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.LG\"]","methods":"[]","has_code":false}
