{"ID":2825677,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.09710","arxiv_id":"2601.09710","title":"Multi-Level Embedding Conformer Framework for Bengali Automatic Speech Recognition","abstract":"Bengali, spoken by over 300 million people, is a morphologically rich and low-resource language, posing challenges for automatic speech recognition (ASR). This research presents an end-to-end framework for Bengali ASR, building on a Conformer-CTC backbone with a multi-level embedding fusion mechanism that incorporates phoneme, syllable, and wordpiece representations. By enriching acoustic features with these linguistic embeddings, the model captures fine-grained phonetic cues and higher-level contextual patterns. The architecture employs early and late Conformer stages, with preprocessing steps including silence trimming, resampling, Log-Mel spectrogram extraction, and SpecAugment augmentation. The experimental results demonstrate the strong potential of the model, achieving a word error rate (WER) of 10.01% and a character error rate (CER) of 5.03%. These results demonstrate the effectiveness of combining multi-granular linguistic information with acoustic modeling, providing a scalable approach for low-resource ASR development.","short_abstract":"Bengali, spoken by over 300 million people, is a morphologically rich and low-resource language, posing challenges for automatic speech recognition (ASR). This research presents an end-to-end framework for Bengali ASR, building on a Conformer-CTC backbone with a multi-level embedding fusion mechanism that incorporates p...","url_abs":"https://arxiv.org/abs/2601.09710","url_pdf":"https://arxiv.org/pdf/2601.09710v1","authors":"[\"Md. Nazmus Sakib\",\"Golam Mahmud\",\"Md. Maruf Bangabashi\",\"Umme Ara Mahinur Istia\",\"Md. Jahidul Islam\",\"Partha Sarker\",\"Afra Yeamini Prity\"]","published":"2025-12-23T04:39:12Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.CL\"]","methods":"[]","has_code":false}
