{"ID":2877554,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.03372","arxiv_id":"2509.03372","title":"An Effective Strategy for Modeling Score Ordinality and Non-uniform Intervals in Automated Speaking Assessment","abstract":"A recent line of research on automated speaking assessment (ASA) has benefited from self-supervised learning (SSL) representations, which capture rich acoustic and linguistic patterns in non-native speech without underlying assumptions of feature curation. However, speech-based SSL models capture acoustic-related traits but overlook linguistic content, while text-based SSL models rely on ASR output and fail to encode prosodic nuances. Moreover, most prior arts treat proficiency levels as nominal classes, ignoring their ordinal structure and non-uniform intervals between proficiency labels. To address these limitations, we propose an effective ASA approach combining SSL with handcrafted indicator features via a novel modeling paradigm. We further introduce a multi-margin ordinal loss that jointly models both the score ordinality and non-uniform intervals of proficiency labels. Extensive experiments on the TEEMI corpus show that our method consistently outperforms strong baselines and generalizes well to unseen prompts.","short_abstract":"A recent line of research on automated speaking assessment (ASA) has benefited from self-supervised learning (SSL) representations, which capture rich acoustic and linguistic patterns in non-native speech without underlying assumptions of feature curation. However, speech-based SSL models capture acoustic-related trait...","url_abs":"https://arxiv.org/abs/2509.03372","url_pdf":"https://arxiv.org/pdf/2509.03372v2","authors":"[\"Tien-Hong Lo\",\"Szu-Yu Chen\",\"Yao-Ting Sung\",\"Berlin Chen\"]","published":"2025-08-27T09:18:51Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.LG\",\"cs.SD\"]","methods":"[]","has_code":false}