{"ID":2863474,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24544","arxiv_id":"2509.24544","title":"Quantitative convergence of trained single layer neural networks to Gaussian processes","abstract":"In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings, precise, finite-width estimates remain limited, particularly during training. We provide explicit upper bounds on the quadratic Wasserstein distance between the network output and its Gaussian approximation at any training time $t \\ge 0$, demonstrating polynomial decay with network width. Our results quantify how architectural parameters, such as width and input dimension, influence convergence, and how training dynamics affect the approximation error.","short_abstract":"In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings, precise, finite-width estimates remain limited, particularly...","url_abs":"https://arxiv.org/abs/2509.24544","url_pdf":"https://arxiv.org/pdf/2509.24544v3","authors":"[\"Eloy Mosig\",\"Andrea Agazzi\",\"Dario Trevisan\"]","published":"2025-09-29T09:59:27Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\",\"math.PR\"]","methods":"[]","has_code":false}
