{"ID":2830737,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.09678","arxiv_id":"2512.09678","title":"The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization","abstract":"In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanions, which are closely related to Dion. By working with duals of convex combinations of the Ky Fan $k$-norms with either the Frobenius norm or the $l_\\infty$ norm, we construct the families of F-Fanions and S-Fanions, respectively. Their most prominent members are F-Muon and S-Muon. We complement our theoretical analysis with an extensive empirical study of these algorithms across a wide range of tasks and settings, demonstrating that F-Muon and S-Muon consistently match Muon's performance, while outperforming vanilla Muon on a synthetic linear least squares problem.","short_abstract":"In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanion...","url_abs":"https://arxiv.org/abs/2512.09678","url_pdf":"https://arxiv.org/pdf/2512.09678v1","authors":"[\"Alexey Kravatskiy\",\"Ivan Kozyrev\",\"Nikolai Kozlov\",\"Alexander Vinogradov\",\"Daniil Merkulov\",\"Ivan Oseledets\"]","published":"2025-12-10T14:25:45Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Language Model\"]","has_code":false}
