{"ID":2856831,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10690","arxiv_id":"2510.10690","title":"Second-order Optimization under Heavy-Tailed Noise: Hessian Clipping and Sample Complexity Limits","abstract":"Heavy-tailed noise is pervasive in modern machine learning applications, arising from data heterogeneity, outliers, and non-stationary stochastic environments. While second-order methods can significantly accelerate convergence in light-tailed or bounded-noise settings, such algorithms are often brittle and lack guarantees under heavy-tailed noise -- precisely the regimes where robustness is most critical. In this work, we take a first step toward a theoretical understanding of second-order optimization under heavy-tailed noise. We consider a setting where stochastic gradients and Hessians have only bounded $p$-th moments, for some $p\\in (1,2]$, and establish tight lower bounds on the sample complexity of any second-order method. We then develop a variant of normalized stochastic gradient descent that leverages second-order information and provably matches these lower bounds. To address the instability caused by large deviations, we introduce a novel algorithm based on gradient and Hessian clipping, and prove high-probability upper bounds that nearly match the fundamental limits. Our results provide the first comprehensive sample complexity characterization for second-order optimization under heavy-tailed noise. This positions Hessian clipping as a robust and theoretically sound strategy for second-order algorithm design in heavy-tailed regimes.","short_abstract":"Heavy-tailed noise is pervasive in modern machine learning applications, arising from data heterogeneity, outliers, and non-stationary stochastic environments. While second-order methods can significantly accelerate convergence in light-tailed or bounded-noise settings, such algorithms are often brittle and lack guaran...","url_abs":"https://arxiv.org/abs/2510.10690","url_pdf":"https://arxiv.org/pdf/2510.10690v1","authors":"[\"Abdurakhmon Sadiev\",\"Peter Richtárik\",\"Ilyas Fatkhullin\"]","published":"2025-10-12T16:36:54Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"cs.LG\"]","methods":"[]","has_code":false}
