{"ID":2836353,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03062","arxiv_id":"2512.03062","title":"Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing","abstract":"Large Language Models (LLMs) are very demanding in terms of their computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach for LLM compression, but presents several practical hurdles, e.g. selecting appropriate layer-wise ranks and getting rid of its parameter redundancy. In this work, we present two physics-inspired improvements to SVD LLM compression: (1) \\textbf{FermiGrad}, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) \\textbf{PivGa}, an additional \\textit{lossless} compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.","short_abstract":"Large Language Models (LLMs) are very demanding in terms of their computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach for LLM compression, but presents several practical hurdles, e.g. selecting appropriate layer-wise ranks and getting rid...","url_abs":"https://arxiv.org/abs/2512.03062","url_pdf":"https://arxiv.org/pdf/2512.03062v1","authors":"[\"Roman Rausch\",\"David Jansen\",\"Sukhbinder Singh\",\"Román Orús\"]","published":"2025-11-26T10:54:01Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"stat.ML\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
