{"ID":2878873,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.17282","arxiv_id":"2508.17282","title":"ERF-BA-TFD+: A Multimodal Model for Audio-Visual Deepfake Detection","abstract":"Deepfake detection is a critical task in identifying manipulated multimedia content. In real-world scenarios, deepfake content can manifest across multiple modalities, including audio and video. To address this challenge, we present ERF-BA-TFD+, a novel multimodal deepfake detection model that combines enhanced receptive field (ERF) and audio-visual fusion. Our model processes both audio and video features simultaneously, leveraging their complementary information to improve detection accuracy and robustness. The key innovation of ERF-BA-TFD+ lies in its ability to model long-range dependencies within the audio-visual input, allowing it to better capture subtle discrepancies between real and fake content. In our experiments, we evaluate ERF-BA-TFD+ on the DDL-AV dataset, which consists of both segmented and full-length video clips. Unlike previous benchmarks, which focused primarily on isolated segments, the DDL-AV dataset allows us to assess the model's performance in a more comprehensive and realistic setting. Our method achieves state-of-the-art results on this dataset, outperforming existing techniques in terms of both accuracy and processing speed. The ERF-BA-TFD+ model demonstrated its effectiveness in the \"Workshop on Deepfake Detection, Localization, and Interpretability,\" Track 2: Audio-Visual Detection and Localization (DDL-AV), and won first place in this competition.","short_abstract":"Deepfake detection is a critical task in identifying manipulated multimedia content. In real-world scenarios, deepfake content can manifest across multiple modalities, including audio and video. \nTo address this challenge, we present ERF-BA-TFD+, a novel multimodal deepfake detection model that combines enhanced recepti...","url_abs":"https://arxiv.org/abs/2508.17282","url_pdf":"https://arxiv.org/pdf/2508.17282v2","authors":"[\"Xin Zhang\",\"Jiaming Chu\",\"Jian Zhao\",\"Yuchu Jiang\",\"Xu Yang\",\"Lei Jin\",\"Chi Zhang\",\"Xuelong Li\"]","published":"2025-08-24T10:03:46Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.SD\"]","methods":"[]","has_code":false}
