{"ID":2864991,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.21833","arxiv_id":"2509.21833","title":"Lightweight Front-end Enhancement for Robust ASR via Frame Resampling and Sub-Band Pruning","abstract":"Recent advancements in automatic speech recognition (ASR) have achieved notable progress, whereas robustness in noisy environments remains challenging. While speech enhancement (SE) front-ends are widely used to mitigate noise as a preprocessing step for ASR, they often introduce non-negligible computational overhead. This paper proposes optimizations to reduce SE computational costs without compromising ASR performance. Our approach integrates layer-wise frame resampling and progressive sub-band pruning. Frame resampling downsamples inputs within layers, utilizing residual connections to mitigate information loss. Simultaneously, sub-band pruning progressively excludes less informative frequency bands, further reducing computational demands. Extensive experiments on synthetic and real-world noisy datasets demonstrate that our system reduces SE computational overhead by over 66% compared to the standard BSRNN, while maintaining strong ASR performance.","short_abstract":"Recent advancements in automatic speech recognition (ASR) have achieved notable progress, whereas robustness in noisy environments remains challenging. While speech enhancement (SE) front-ends are widely used to mitigate noise as a preprocessing step for ASR, they often introduce non-negligible computational overhead....","url_abs":"https://arxiv.org/abs/2509.21833","url_pdf":"https://arxiv.org/pdf/2509.21833v1","authors":"[\"Siyi Zhao\",\"Wei Wang\",\"Yanmin Qian\"]","published":"2025-09-26T03:49:32Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[]","has_code":false}
