{"ID":2874617,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04161","arxiv_id":"2509.04161","title":"Wav2DF-TSL: Two-stage Learning with Efficient Pre-training and Hierarchical Experts Fusion for Robust Audio Deepfake Detection","abstract":"In recent years, self-supervised learning (SSL) models have made significant progress in audio deepfake detection (ADD) tasks. However, existing SSL models mainly rely on large-scale real speech for pre-training and lack the learning of spoofed samples, which leads to susceptibility to domain bias during the fine-tuning process of the ADD task. To this end, we propose a two-stage learning strategy (Wav2DF-TSL) based on pre-training and hierarchical expert fusion for robust audio deepfake detection. In the pre-training stage, we use adapters to efficiently learn artifacts from 3000 hours of unlabelled spoofed speech, improving the adaptability of front-end features while mitigating catastrophic forgetting. In the fine-tuning stage, we propose the hierarchical adaptive mixture of experts (HA-MoE) method to dynamically fuse multi-level spoofing cues through multi-expert collaboration with gated routing. Experimental results show that the proposed method significantly outperforms the baseline system on all four benchmark datasets, especially on the cross-domain In-the-wild dataset, achieving a 27.5% relative improvement in equal error rate (EER), outperforming the existing state-of-the-art systems. Index Terms: audio deepfake detection, self-supervised learning, parameter-efficient fine-tuning, mixture of experts","short_abstract":"In recent years, self-supervised learning (SSL) models have made significant progress in audio deepfake detection (ADD) tasks. However, existing SSL models mainly rely on large-scale real speech for pre-training and lack the learning of spoofed samples, which leads to susceptibility to domain bias during the fine-tunin...","url_abs":"https://arxiv.org/abs/2509.04161","url_pdf":"https://arxiv.org/pdf/2509.04161v1","authors":"[\"Yunqi Hao\",\"Yihao Chen\",\"Minqiang Xu\",\"Jianbo Zhan\",\"Liang He\",\"Lei Fang\",\"Sian Fang\",\"Lin Liu\"]","published":"2025-09-04T12:35:10Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[\"Mixture of Experts\"]","has_code":false}
