{"ID":3004696,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-05T11:43:53.432517148Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03874","arxiv_id":"2606.03874","title":"DyaPlex: Full-Duplex Speech-Motion Model for Dyadic Interaction","abstract":"We present DyaPlex, a streaming, full-duplex speech-and-motion model designed for dyadic interaction. To capture the continuous and reciprocal nature of human communication, this full-duplex capability empowers the agent to simultaneously perceive and generate both speech and physical motion in a streaming fashion. At its core, our method leverages the strong priors of a foundational full-duplex speech model and integrates a novel motion pathway, thereby achieving fully synchronized multi-modal interaction. Specifically, we design a dual-tower Transformer architecture that preserves the zero-shot conversational reasoning of a frozen base speech model while constructing a deeply coupled, streaming motion pathway. By introducing a unified dyadic token interleaving mechanism and guiding cross-attention via a time-aligned speech-motion RoPE, our model effectively aligns autoregressive motions with rich latent speech features. Trained on the 4,000-hour Seamless Interaction dataset, our model effectively captures cross-speaker dependencies and establishes new state-of-the-art performance across both monadic and dyadic human interaction benchmarks.","short_abstract":"We present DyaPlex, a streaming, full-duplex speech-and-motion model designed for dyadic interaction. To capture the continuous and reciprocal nature of human communication, this full-duplex capability empowers the agent to simultaneously perceive and generate both speech and physical motion in a streaming fashion. \nAt...","url_abs":"https://arxiv.org/abs/2606.03874","url_pdf":"https://arxiv.org/pdf/2606.03874v1","authors":"[\"Koki Nagano\",\"Hongyu Liu\",\"Seonwook Park\",\"Tianye Li\",\"Amrita Mazumdar\",\"Christian Jacobsen\",\"Shengze Wang\",\"Michael Stengel\",\"Rajarshi Roy\",\"Ka Chun Cheung\",\"Simon See\",\"Shalini De Mello\"]","published":"2026-06-02T16:42:56Z","proceeding":"cs.CV","tasks":"[\"cs.CV\",\"cs.RO\"]","methods":"[\"Transformer\"]","has_code":false}
