{"ID":2874838,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.02937","arxiv_id":"2509.02937","title":"Faster Gradient Methods for Highly-Smooth Stochastic Bilevel Optimization","abstract":"This paper studies the complexity of finding an $ε$-stationary point for stochastic bilevel optimization when the upper-level problem is nonconvex and the lower-level problem is strongly convex. Recent work proposed the first-order method, F${}^2$SA, achieving the $\\tilde{\\mathcal{O}}(ε^{-6})$ upper complexity bound for first-order smooth problems. This is slower than the optimal $Ω(ε^{-4})$ complexity lower bound in its single-level counterpart. In this work, we show that faster rates are achievable for higher-order smooth problems. We first reformulate F$^2$SA as approximating the hyper-gradient with a forward difference. Based on this observation, we propose a class of methods F${}^2$SA-$p$ that uses $p$th-order finite difference for hyper-gradient approximation and improves the upper bound to $\\tilde{\\mathcal{O}}(p ε^{-4-p/2})$ for $p$th-order smooth problems. Finally, we demonstrate that the $Ω(ε^{-4})$ lower bound also holds for stochastic bilevel problems when the high-order smoothness holds for the lower-level variable, indicating that the upper bound of F${}^2$SA-$p$ is nearly optimal in the highly smooth region $p = Ω( \\log ε^{-1} / \\log \\log ε^{-1})$.","short_abstract":"This paper studies the complexity of finding an $ε$-stationary point for stochastic bilevel optimization when the upper-level problem is nonconvex and the lower-level problem is strongly convex. Recent work proposed the first-order method, F${}^2$SA, achieving the $\\tilde{\\mathcal{O}}(ε^{-6})$ upper complexity bound fo...","url_abs":"https://arxiv.org/abs/2509.02937","url_pdf":"https://arxiv.org/pdf/2509.02937v2","authors":"[\"Lesi Chen\",\"Junru Li\",\"El Mahdi Chayti\",\"Jingzhao Zhang\"]","published":"2025-09-03T02:02:52Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}
