{"ID":2886645,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.02000","arxiv_id":"2508.02000","title":"Localizing Audio-Visual Deepfakes via Hierarchical Boundary Modeling","abstract":"Audio-visual temporal deepfake localization under the content-driven partial manipulation remains a highly challenging task. In this scenario, the deepfake regions are usually only spanning a few frames, with the majority of the rest remaining identical to the original. To tackle this, we propose a Hierarchical Boundary Modeling Network (HBMNet), which includes three modules: an Audio-Visual Feature Encoder that extracts discriminative frame-level representations, a Coarse Proposal Generator that predicts candidate boundary regions, and a Fine-grained Probabilities Generator that refines these proposals using bidirectional boundary-content probabilities. From the modality perspective, we enhance audio-visual learning through dedicated encoding and fusion, reinforced by frame-level supervision to boost discriminability. From the temporal perspective, HBMNet integrates multi-scale cues and bidirectional boundary-content relationships. Experiments show that encoding and fusion primarily improve precision, while frame-level supervision boosts recall. Each module (audio-visual fusion, temporal scales, bi-directionality) contributes complementary benefits, collectively enhancing localization performance. HBMNet outperforms BA-TFD and UMMAFormer and shows improved potential scalability with more training data.","short_abstract":"Audio-visual temporal deepfake localization under the content-driven partial manipulation remains a highly challenging task. In this scenario, the deepfake regions are usually only spanning a few frames, with the majority of the rest remaining identical to the original. \nTo tackle this, we propose a Hierarchical Boundar...","url_abs":"https://arxiv.org/abs/2508.02000","url_pdf":"https://arxiv.org/pdf/2508.02000v1","authors":"[\"Xuanjun Chen\",\"Shih-Peng Cheng\",\"Jiawei Du\",\"Lin Zhang\",\"Xiaoxiao Miao\",\"Chung-Che Wang\",\"Haibin Wu\",\"Hung-yi Lee\",\"Jyh-Shing Roger Jang\"]","published":"2025-08-04T02:41:09Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.CV\",\"eess.AS\",\"eess.IV\"]","methods":"[]","has_code":false}
