{"ID":2895521,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.08227","arxiv_id":"2507.08227","title":"RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing","abstract":"Automatic speaker verification (ASV) systems are often affected by spoofing attacks. Recent transformer-based models have improved anti-spoofing performance by learning strong feature representations. However, these models usually need high computing power. To address this, we introduce RawTFNet, a lightweight CNN model designed for audio signals. The RawTFNet separates feature processing along time and frequency dimensions, which helps to capture the fine-grained details of synthetic speech. We tested RawTFNet on the ASVspoof 2021 LA and DF evaluation datasets. The results show that RawTFNet reaches comparable performance to that of the state-of-the-art models, while also using fewer computing resources. The code and models will be made publicly available.","short_abstract":"Automatic speaker verification (ASV) systems are often affected by spoofing attacks. Recent transformer-based models have improved anti-spoofing performance by learning strong feature representations. However, these models usually need high computing power. To address this, we introduce RawTFNet, a lightweight CNN mode...","url_abs":"https://arxiv.org/abs/2507.08227","url_pdf":"https://arxiv.org/pdf/2507.08227v1","authors":"[\"Yang Xiao\",\"Ting Dang\",\"Rohan Kumar Das\"]","published":"2025-07-11T00:24:47Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.SD\"]","methods":"[\"Transformer\",\"Convolutional Neural Network\"]","has_code":false}