{"ID":2869881,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.14304","arxiv_id":"2509.14304","title":"Deploying UDM Series in Real-Life Stuttered Speech Applications: A Clinical Evaluation Framework","abstract":"Stuttered and dysfluent speech detection systems have traditionally suffered from the trade-off between accuracy and clinical interpretability. While end-to-end deep learning models achieve high performance, their black-box nature limits clinical adoption. This paper looks at the Unconstrained Dysfluency Modeling (UDM) series-the current state-of-the-art framework developed by Berkeley that combines modular architecture, explicit phoneme alignment, and interpretable outputs for real-world clinical deployment. Through extensive experiments involving patients and certified speech-language pathologists (SLPs), we demonstrate that UDM achieves state-of-the-art performance (F1: 0.89+-0.04) while providing clinically meaningful interpretability scores (4.2/5.0). Our deployment study shows 87% clinician acceptance rate and 34% reduction in diagnostic time. The results provide strong evidence that UDM represents a practical pathway toward AI-assisted speech therapy in clinical environments.","short_abstract":"Stuttered and dysfluent speech detection systems have traditionally suffered from the trade-off between accuracy and clinical interpretability. While end-to-end deep learning models achieve high performance, their black-box nature limits clinical adoption. This paper looks at the Unconstrained Dysfluency Modeling (UDM)...","url_abs":"https://arxiv.org/abs/2509.14304","url_pdf":"https://arxiv.org/pdf/2509.14304v1","authors":"[\"Eric Zhang\",\"Li Wei\",\"Sarah Chen\",\"Michael Wang\"]","published":"2025-09-17T14:28:29Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\",\"eess.AS\"]","methods":"[]","has_code":false}
