{"ID":2829922,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.11724","arxiv_id":"2512.11724","title":"From Signal to Turn: Interactional Friction in Modular Speech-to-Speech Pipelines","abstract":"While voice-based AI systems have achieved remarkable generative capabilities, their interactions often feel conversationally broken. This paper examines the interactional friction that emerges in modular Speech-to-Speech Retrieval-Augmented Generation (S2S-RAG) pipelines. By analyzing a representative production system, we move beyond simple latency metrics to identify three recurring patterns of conversational breakdown: (1) Temporal Misalignment, where system delays violate user expectations of conversational rhythm; (2) Expressive Flattening, where the loss of paralinguistic cues leads to literal, inappropriate responses; and (3) Repair Rigidity, where architectural gating prevents users from correcting errors in real-time. Through system-level analysis, we demonstrate that these friction points should not be understood as defects or failures, but as structural consequences of a modular design that prioritizes control over fluidity. We conclude that building natural spoken AI is an infrastructure design challenge, requiring a shift from optimizing isolated components to carefully choreographing the seams between them.","short_abstract":"While voice-based AI systems have achieved remarkable generative capabilities, their interactions often feel conversationally broken. This paper examines the interactional friction that emerges in modular Speech-to-Speech Retrieval-Augmented Generation (S2S-RAG) pipelines. By analyzing a representative production syste...","url_abs":"https://arxiv.org/abs/2512.11724","url_pdf":"https://arxiv.org/pdf/2512.11724v2","authors":"[\"Tittaya Mairittha\",\"Tanakon Sawanglok\",\"Panuwit Raden\",\"Jirapast Buntub\",\"Thanapat Warunee\",\"Napat Asawachaisuvikrom\",\"Thanaphum Saiwongin\"]","published":"2025-12-12T17:05:11Z","proceeding":"cs.HC","tasks":"[\"cs.HC\",\"cs.AI\",\"cs.CL\",\"cs.SE\"]","methods":"[\"RAG\"]","has_code":false}
