{"ID":2886795,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.02193","arxiv_id":"2508.02193","title":"Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference","abstract":"We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference speed. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup to mitigate the inherent latency of token-by-token decoding, as demonstrated recently (e.g., Mercury Coder, Gemini Diffusion). Seed Diffusion Preview achieves an inference speed of 2,146 token/s over H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks, significantly faster than contemporary Mercury and Gemini Diffusion, establishing new state of the art on the speed-quality Pareto frontier for code models.","short_abstract":"We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference speed. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup to mitigate the inherent latency of token-by-token decoding, as demonstrated rec...","url_abs":"https://arxiv.org/abs/2508.02193","url_pdf":"https://arxiv.org/pdf/2508.02193v1","authors":"[\"Yuxuan Song\",\"Zheng Zhang\",\"Cheng Luo\",\"Pengyang Gao\",\"Fan Xia\",\"Hao Luo\",\"Zheng Li\",\"Yuehang Yang\",\"Hongli Yu\",\"Xingwei Qu\",\"Yuwei Fu\",\"Jing Su\",\"Ge Zhang\",\"Wenhao Huang\",\"Mingxuan Wang\",\"Lin Yan\",\"Xiaoying Jia\",\"Jingjing Liu\",\"Wei-Ying Ma\",\"Ya-Qin Zhang\",\"Yonghui Wu\",\"Hao Zhou\"]","published":"2025-08-04T08:43:01Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.LG\"]","methods":"[\"Diffusion Model\",\"Language Model\"]","has_code":false}