{"ID":2836911,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.20207","arxiv_id":"2511.20207","title":"Adaptive SGD with Line-Search and Polyak Stepsizes: Nonconvex Convergence and Accelerated Rates","abstract":"We extend the convergence analysis of AdaSLS and AdaSPS in [Jiang and Stich, 2024] to the nonconvex setting, presenting a unified convergence analysis of stochastic gradient descent with adaptive Armijo line-search (AdaSLS) and Polyak stepsize (AdaSPS) for nonconvex optimization. Our contributions include: (1) an $\\mathcal{O}(1/\\sqrt{T})$ convergence rate for general nonconvex smooth functions, (2) an $\\mathcal{O}(1/T)$ rate under quasar-convexity and interpolation, and (3) an $\\mathcal{O}(1/T)$ rate under the strong growth condition for general nonconvex functions.","short_abstract":"We extend the convergence analysis of AdaSLS and AdaSPS in [Jiang and Stich, 2024] to the nonconvex setting, presenting a unified convergence analysis of stochastic gradient descent with adaptive Armijo line-search (AdaSLS) and Polyak stepsize (AdaSPS) for nonconvex optimization. Our contributions include: (1) an $\\mat...","url_abs":"https://arxiv.org/abs/2511.20207","url_pdf":"https://arxiv.org/pdf/2511.20207v4","authors":"[\"Haotian Wu\"]","published":"2025-11-25T11:33:00Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"stat.ML\"]","methods":"[]","has_code":false}
