{"ID":2823571,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.24545","arxiv_id":"2512.24545","title":"More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization","abstract":"For extreme low-bit quantization of large language models (LLMs), Double Binary Factorization (DBF) is attractive as it enables efficient inference without sacrificing accuracy. However, the scaling parameters of DBF are too restrictive; after factoring out signs, all rank components share the same magnitude profile, resulting in performance saturation. We propose Multi-envelope DBF (MDBF), which retains a shared pair of 1-bit sign bases but replaces the single envelope with a rank-$l$ envelope. By sharing sign matrices among envelope components, MDBF effectively maintains a binary carrier and utilizes the limited memory budget for magnitude expressiveness. We also introduce a closed-form initialization and an alternating refinement method to optimize MDBF. Across the LLaMA and Qwen families, MDBF enhances perplexity and zero-shot accuracy over previous binary formats at matched bits per weight while preserving the same deployment-friendly inference primitive.","short_abstract":"For extreme low-bit quantization of large language models (LLMs), Double Binary Factorization (DBF) is attractive as it enables efficient inference without sacrificing accuracy. However, the scaling parameters of DBF are too restrictive; after factoring out signs, all rank components share the same magnitude profile, r...","url_abs":"https://arxiv.org/abs/2512.24545","url_pdf":"https://arxiv.org/pdf/2512.24545v1","authors":"[\"Yuma Ichikawa\",\"Yoshihiko Fujisawa\",\"Yudai Fujimoto\",\"Akira Sakai\",\"Katsuki Fujisawa\"]","published":"2025-12-31T01:04:34Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.CL\",\"stat.ML\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
