{"ID":2867114,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.20396","arxiv_id":"2509.20396","title":"Data-Efficient ASR Personalization for Non-Normative Speech Using an Uncertainty-Based Phoneme Difficulty Score for Guided Sampling","abstract":"ASR systems struggle with non-normative speech due to high acoustic variability and data scarcity. We propose a data-efficient method using phoneme-level uncertainty to guide fine-tuning for personalization. Instead of computationally expensive ensembles, we leverage Variational Low-Rank Adaptation (VI LoRA) to estimate epistemic uncertainty in foundation models. These estimates form a composite Phoneme Difficulty Score (PhDScore) that drives a targeted oversampling strategy. Evaluated on English and German datasets, including a longitudinal analysis against two clinical reports taken one year apart, we demonstrate that: (1) VI LoRA-based uncertainty aligns better with expert clinical assessments than standard entropy; (2) PhDScore captures stable, persistent articulatory difficulties; and (3) uncertainty-guided sampling significantly improves ASR accuracy for impaired speech.","short_abstract":"ASR systems struggle with non-normative speech due to high acoustic variability and data scarcity. We propose a data-efficient method using phoneme-level uncertainty to guide fine-tuning for personalization. Instead of computationally expensive ensembles, we leverage Variational Low-Rank Adaptation (VI LoRA) to estimat...","url_abs":"https://arxiv.org/abs/2509.20396","url_pdf":"https://arxiv.org/pdf/2509.20396v2","authors":"[\"Niclas Pokel\",\"Pehuén Moure\",\"Roman Böhringer\",\"Yingqiang Gao\"]","published":"2025-09-23T12:54:30Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.AI\",\"cs.SD\"]","methods":"[\"LoRA\"]","has_code":false}
