{"ID":2891373,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.17527","arxiv_id":"2507.17527","title":"Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice","abstract":"Simultaneous Interpretation (SI) represents one of the most daunting frontiers in the translation industry, with product-level automatic systems long plagued by intractable challenges: subpar transcription and translation quality, lack of real-time speech generation, multi-speaker confusion, and translated speech inflation, especially in long-form discourses. In this study, we introduce Seed-LiveInterpret 2.0, an end-to-end SI model that delivers high-fidelity, ultra-low-latency speech-to-speech generation with voice cloning capabilities. As a fully operational product-level solution, Seed-LiveInterpret 2.0 tackles these challenges head-on through our novel duplex speech-to-speech understanding-generating framework. Experimental results demonstrate that through large-scale pretraining and reinforcement learning, the model achieves a significantly better balance between translation accuracy and latency, validated by human interpreters to exceed 70% correctness in complex scenarios. 
Notably, Seed-LiveInterpret 2.0 outperforms commercial SI solutions by significant margins in translation quality, while slashing the average latency of cloned speech from nearly 10 seconds to a near-real-time 3 seconds, a reduction of nearly 70% that drastically enhances practical usability.","short_abstract":"Simultaneous Interpretation (SI) represents one of the most daunting frontiers in the translation industry, with product-level automatic systems long plagued by intractable challenges: subpar transcription and translation quality, lack of real-time speech generation, multi-speaker confusion, and translated speech infla...","url_abs":"https://arxiv.org/abs/2507.17527","url_pdf":"https://arxiv.org/pdf/2507.17527v3","authors":"[\"Shanbo Cheng\",\"Yu Bao\",\"Zhichao Huang\",\"Yu Lu\",\"Ningxin Peng\",\"Lu Xu\",\"Runsheng Yu\",\"Rong Cao\",\"Yujiao Du\",\"Ting Han\",\"Yuxiang Hu\",\"Zeyang Li\",\"Sitong Liu\",\"Shengtao Ma\",\"Shiguang Pan\",\"Jiongchen Xiao\",\"Nuo Xu\",\"Meng Yang\",\"Rong Ye\",\"Yiming Yu\",\"Jun Zhang\",\"Ruofei Zhang\",\"Wanyi Zhang\",\"Wenhao Zhu\",\"Liehao Zou\",\"Lu Lu\",\"Yuxuan Wang\",\"Yonghui Wu\"]","published":"2025-07-23T14:07:41Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.SD\",\"eess.AS\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}