{"ID":2869804,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.13878","arxiv_id":"2509.13878","title":"Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection","abstract":"Foundation models such as Wav2Vec2 excel at representation learning in speech tasks, including audio deepfake detection. However, after being fine-tuned on a fixed set of bonafide and spoofed audio clips, they often fail to generalize to novel deepfake methods not represented in training. To address this, we propose a mixture-of-LoRA-experts approach that integrates multiple low-rank adapters (LoRA) into the model's attention layers. A routing mechanism selectively activates specialized experts, enhancing adaptability to evolving deepfake attacks. Experimental results show that our method outperforms standard fine-tuning in both in-domain and out-of-domain scenarios, reducing equal error rates relative to baseline models. Notably, our best MoE-LoRA model lowers the average out-of-domain EER from 8.55% to 6.08%, demonstrating its effectiveness in achieving generalizable audio deepfake detection.","short_abstract":"Foundation models such as Wav2Vec2 excel at representation learning in speech tasks, including audio deepfake detection. However, after being fine-tuned on a fixed set of bonafide and spoofed audio clips, they often fail to generalize to novel deepfake methods not represented in training. To address this, we propose a...","url_abs":"https://arxiv.org/abs/2509.13878","url_pdf":"https://arxiv.org/pdf/2509.13878v1","authors":"[\"Janne Laakkonen\",\"Ivan Kukanov\",\"Ville Hautamäki\"]","published":"2025-09-17T10:13:58Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.LG\",\"cs.SD\"]","methods":"[\"LoRA\"]","has_code":false}
