{"ID":2886599,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.03937","arxiv_id":"2508.03937","title":"LCS-CTC: Leveraging Soft Alignments to Enhance Phonetic Transcription Robustness","abstract":"Phonetic speech transcription is crucial for fine-grained linguistic analysis and downstream speech applications. While Connectionist Temporal Classification (CTC) is a widely used approach for such tasks due to its efficiency, it often falls short in recognition performance, especially under unclear and nonfluent speech. In this work, we propose LCS-CTC, a two-stage framework for phoneme-level speech recognition that combines a similarity-aware local alignment algorithm with a constrained CTC training objective. By predicting fine-grained frame-phoneme cost matrices and applying a modified Longest Common Subsequence (LCS) algorithm, our method identifies high-confidence alignment zones which are used to constrain the CTC decoding path space, thereby reducing overfitting and improving generalization ability, which enables both robust recognition and text-free forced alignment. Experiments on both LibriSpeech and PPA demonstrate that LCS-CTC consistently outperforms vanilla CTC baselines, suggesting its potential to unify phoneme modeling across fluent and non-fluent speech.","short_abstract":"Phonetic speech transcription is crucial for fine-grained linguistic analysis and downstream speech applications. While Connectionist Temporal Classification (CTC) is a widely used approach for such tasks due to its efficiency, it often falls short in recognition performance, especially under unclear and nonfluent spee...","url_abs":"https://arxiv.org/abs/2508.03937","url_pdf":"https://arxiv.org/pdf/2508.03937v2","authors":"[\"Zongli Ye\",\"Jiachen Lian\",\"Akshaj Gupta\",\"Xuanru Zhou\",\"Haodong Li\",\"Krish Patel\",\"Hwi Joo Park\",\"Dingkun Zhou\",\"Chenxu Guo\",\"Shuhe Li\",\"Sam Wang\",\"Iris Zhou\",\"Cheol Jun Cho\",\"Zoe Ezzes\",\"Jet M. J. Vonk\",\"Brittany T. Morin\",\"Rian Bogley\",\"Lisa Wauters\",\"Zachary A. Miller\",\"Maria Luisa Gorno-Tempini\",\"Gopala Anumanchipalli\"]","published":"2025-08-05T21:59:35Z","proceeding":"eess.AS","tasks":"[\"eess.AS\"]","methods":"[]","has_code":false}
