{"ID":2868964,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.16195","arxiv_id":"2509.16195","title":"FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation","abstract":"Neural audio codecs are a fundamental component of modern generative audio pipelines. Although recent codecs achieve strong low-bitrate reconstruction and provide powerful representations for downstream tasks, most are non-streamable, limiting their use in real-time applications. We present FocalCodec-Stream, a hybrid codec based on focal modulation that compresses speech into a single binary codebook at 0.55 - 0.80 kbps with a theoretical latency of 80 ms. Our approach combines multi-stage causal distillation of WavLM with targeted architectural improvements, including a lightweight refiner module that enhances quality under latency constraints. Experiments show that FocalCodec-Stream outperforms existing streamable codecs at comparable bitrates, while preserving both semantic and acoustic information. The result is a favorable trade-off between reconstruction quality, downstream task performance, latency, and efficiency. Code and checkpoints will be released at https://github.com/lucadellalib/focalcodec.","short_abstract":"Neural audio codecs are a fundamental component of modern generative audio pipelines. Although recent codecs achieve strong low-bitrate reconstruction and provide powerful representations for downstream tasks, most are non-streamable, limiting their use in real-time applications. We present FocalCodec-Stream, a hybrid...","url_abs":"https://arxiv.org/abs/2509.16195","url_pdf":"https://arxiv.org/pdf/2509.16195v1","authors":"[\"Luca Della Libera\",\"Cem Subakan\",\"Mirco Ravanelli\"]","published":"2025-09-19T17:57:13Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\",\"cs.LG\",\"eess.AS\"]","methods":"[]","has_code":false,"code_links":[{"ID":609638,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2868964,"paper_url":"https://arxiv.org/abs/2509.16195","paper_title":"FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation","repo_url":"https://github.com/lucadellalib/focalcodec","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
