{"ID":2853926,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.21772","arxiv_id":"2510.21772","title":"Chebyshev Moment Regularization (CMR): Condition-Number Control with Moment Shaping","abstract":"We introduce \\textbf{Chebyshev Moment Regularization (CMR)}, a simple, architecture-agnostic loss that directly optimizes layer spectra. CMR jointly controls spectral edges via a log-condition proxy and shapes the interior via Chebyshev moments, with a decoupled, capped mixing rule that preserves task gradients. We prove strictly monotone descent for the condition proxy, bounded moment gradients, and orthogonal invariance. In an adversarial ``$κ$-stress'' setting (MNIST, 15-layer MLP), \\emph{compared to vanilla training}, CMR reduces mean layer condition numbers by $\\sim\\!10^3$ (from $\\approx3.9\\!\\times\\!10^3$ to $\\approx3.4$ in 5 epochs), increases average gradient magnitude, and restores test accuracy ($\\approx10\\%\\!\\to\\!\\approx86\\%$). These results support \\textbf{optimization-driven spectral preconditioning}: directly steering models toward well-conditioned regimes for stable, accurate learning.","short_abstract":"We introduce \\textbf{Chebyshev Moment Regularization (CMR)}, a simple, architecture-agnostic loss that directly optimizes layer spectra. CMR jointly controls spectral edges via a log-condition proxy and shapes the interior via Chebyshev moments, with a decoupled, capped mixing rule that preserves task gradients. We pro...","url_abs":"https://arxiv.org/abs/2510.21772","url_pdf":"https://arxiv.org/pdf/2510.21772v1","authors":"[\"Jinwoo Baek\"]","published":"2025-10-17T06:54:41Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.NA\"]","methods":"[]","has_code":false}
