{"ID":2866709,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.20321","arxiv_id":"2509.20321","title":"Conversational Speech Reveals Structural Robustness Failures in SpeechLLM Backbones","abstract":"LLMs serve as the backbone in SpeechLLMs, yet their behavior on spontaneous conversational input remains poorly understood. Conversational speech contains pervasive disfluencies -- interjections, edits, and parentheticals -- that are rare in the written corpora used for pre-training. Because gold disfluency removal is a deletion-only task, it serves as a controlled probe to determine whether a model performs faithful structural repair or biased reinterpretation. Using the DRES evaluation framework, we evaluate proprietary and open-source LLMs across architectures and scales. We show that model performance clusters into stable precision-recall regimes reflecting distinct editing policies. Notably, reasoning models systematically over-delete fluent content, revealing a bias toward semantic abstraction over structural fidelity. While fine-tuning achieves SOTA results, it harms generalization. Our findings demonstrate that robustness to speech is shaped by specific training objectives.","short_abstract":"LLMs serve as the backbone in SpeechLLMs, yet their behavior on spontaneous conversational input remains poorly understood. Conversational speech contains pervasive disfluencies -- interjections, edits, and parentheticals -- that are rare in the written corpora used for pre-training. Because gold disfluency removal is...","url_abs":"https://arxiv.org/abs/2509.20321","url_pdf":"https://arxiv.org/pdf/2509.20321v2","authors":"[\"Maria Teleki\",\"Sai Janjur\",\"Haoran Liu\",\"Oliver Grabner\",\"Ketan Verma\",\"Thomas Docog\",\"Xiangjue Dong\",\"Lingfeng Shi\",\"Cong Wang\",\"Stephanie Birkelbach\",\"Jason Kim\",\"Yin Zhang\",\"Éva Székely\",\"James Caverlee\"]","published":"2025-09-24T17:08:12Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"eess.AS\"]","methods":"[\"Large Language Model\"]","has_code":false}
