{"ID":2883265,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.08938","arxiv_id":"2508.08938","title":"DeCRED: Decoder-Centric Regularization for Encoder-Decoder Based Speech Recognition","abstract":"This paper presents a simple yet effective regularization for the internal language model induced by the decoder in encoder-decoder ASR models, thereby improving robustness and generalization in both in- and out-of-domain settings. The proposed method, Decoder-Centric Regularization in Encoder-Decoder (DeCRED), adds auxiliary classifiers to the decoder, enabling next token prediction via intermediate logits. Empirically, DeCRED reduces the mean internal LM BPE perplexity across 11 test sets by 36.6% relative. Furthermore, this translates into actual WER improvements over the baseline in 5 of 7 in-domain and 3 of 4 out-of-domain test sets, reducing macro WER from 6.4% to 6.3% and 18.2% to 16.2%, respectively. On TEDLIUM3, DeCRED achieves 7.0% WER, surpassing the baseline and encoder-centric InterCTC regularization by 0.6% and 0.5%, respectively. Finally, we compare DeCRED with OWSM v3.1 and Whisper-medium, showing competitive WERs despite training on much less data with fewer parameters.","short_abstract":"This paper presents a simple yet effective regularization for the internal language model induced by the decoder in encoder-decoder ASR models, thereby improving robustness and generalization in both in- and out-of-domain settings. The proposed method, Decoder-Centric Regularization in Encoder-Decoder (DeCRED), adds au...","url_abs":"https://arxiv.org/abs/2508.08938","url_pdf":"https://arxiv.org/pdf/2508.08938v1","authors":"[\"Alexander Polok\",\"Santosh Kesiraju\",\"Karel Beneš\",\"Bolaji Yusuf\",\"Lukáš Burget\",\"Jan Černocký\"]","published":"2025-08-12T13:44:50Z","proceeding":"eess.AS","tasks":"[\"eess.AS\"]","methods":"[\"Language Model\"]","has_code":false}
