{"ID":2837282,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.18706","arxiv_id":"2511.18706","title":"CoD: A Diffusion Foundation Model for Image Compression","abstract":"Existing diffusion codecs typically build on text-to-image diffusion foundation models like Stable Diffusion. However, text conditioning is suboptimal from a compression perspective, hindering the potential of downstream diffusion codecs, particularly at ultra-low bitrates. To address it, we introduce \\textbf{CoD}, the first \\textbf{Co}mpression-oriented \\textbf{D}iffusion foundation model, trained from scratch to enable end-to-end optimization of both compression and generation. CoD is not a fixed codec but a general foundation model designed for various diffusion-based codecs. It offers several advantages: \\textbf{High compression efficiency}, replacing Stable Diffusion with CoD in downstream codecs like DiffC achieves SOTA results, especially at ultra-low bitrates (e.g., 0.0039 bpp); \\textbf{Low-cost and reproducible training}, 300$\\times$ faster training than Stable Diffusion ($\\sim$ 20 vs. $\\sim$ 6,250 A100 GPU days) on entirely open image-only datasets; \\textbf{Providing new insights}, e.g., We find pixel-space diffusion can achieve VTM-level PSNR with high perceptual quality and can outperform GAN-based codecs using fewer parameters. We hope CoD lays the foundation for future diffusion codec research. Codes are released at https://github.com/microsoft/GenCodec/tree/main/CoD.","short_abstract":"Existing diffusion codecs typically build on text-to-image diffusion foundation models like Stable Diffusion. However, text conditioning is suboptimal from a compression perspective, hindering the potential of downstream diffusion codecs, particularly at ultra-low bitrates. \nTo address it, we introduce \\textbf{CoD}, the...","url_abs":"https://arxiv.org/abs/2511.18706","url_pdf":"https://arxiv.org/pdf/2511.18706v3","authors":"[\"Zhaoyang Jia\",\"Zihan Zheng\",\"Naifu Xue\",\"Jiahao Li\",\"Bin Li\",\"Zongyu Guo\",\"Xiaoyi Zhang\",\"Houqiang Li\",\"Yan Lu\"]","published":"2025-11-24T03:00:15Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Diffusion Model\",\"Generative Adversarial Network\"]","has_code":false,"code_links":[{"ID":606673,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2837282,"paper_url":"https://arxiv.org/abs/2511.18706","paper_title":"CoD: A Diffusion Foundation Model for Image Compression","repo_url":"https://github.com/microsoft/GenCodec","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
