{"ID":2834878,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00860","arxiv_id":"2512.00860","title":"The Spectral Dimension of NTKs is Constant: A Theory of Implicit Regularization, Finite-Width Stability, and Scalable Estimation","abstract":"Modern deep networks are heavily overparameterized yet often generalize well, suggesting a form of low intrinsic complexity not reflected by parameter counts. We study this complexity at initialization through the effective rank of the Neural Tangent Kernel (NTK) Gram matrix, $r_{\\text{eff}}(K) = (\\text{tr}(K))^2/\\|K\\|_F^2$. For i.i.d. data and the infinite-width NTK $k$, we prove a constant-limit law $\\lim_{n\\to\\infty} \\mathbb{E}[r_{\\text{eff}}(K_n)] = \\mathbb{E}[k(x, x)]^2 / \\mathbb{E}[k(x, x')^2] =: r_\\infty$, with sub-Gaussian concentration. We further establish finite-width stability: if the finite-width NTK deviates in operator norm by $O_p(m^{-1/2})$ (width $m$), then $r_{\\text{eff}}$ changes by $O_p(m^{-1/2})$. We design a scalable estimator using random output probes and a CountSketch of parameter Jacobians and prove conditional unbiasedness and consistency with explicit variance bounds. On CIFAR-10 with ResNet-20/56 (widths 16/32) across $n \\in \\{10^3, 5\\times10^3, 10^4, 2.5\\times10^4, 5\\times10^4\\}$, we observe $r_{\\text{eff}} \\approx 1.0\\text{--}1.3$ and slopes $\\approx 0$ in $n$, consistent with the theory, and the kernel-moment prediction closely matches fitted constants.","short_abstract":"Modern deep networks are heavily overparameterized yet often generalize well, suggesting a form of low intrinsic complexity not reflected by parameter counts. We study this complexity at initialization through the effective rank of the Neural Tangent Kernel (NTK) Gram matrix, $r_{\\text{eff}}(K) = (\\text{tr}(K))^2/\\|K\\|...","url_abs":"https://arxiv.org/abs/2512.00860","url_pdf":"https://arxiv.org/pdf/2512.00860v1","authors":"[\"Praveen Anilkumar Shukla\"]","published":"2025-11-30T12:14:21Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}
