{"ID":2888973,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.23155","arxiv_id":"2507.23155","title":"On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization","abstract":"In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper- and lower-level objectives are smooth but potentially nonconvex. Due to the absence of additional structural assumptions for the lower-level objective-such as convexity or the Polyak-Łojasiewicz (PL) condition-guaranteeing global optimality is generally intractable. Instead, we introduce a suitable notion of stationarity for this class of problems and aim to design a first-order algorithm that finds such stationary points in polynomial time. Intuitively, stationarity in this setting means the upper-level objective cannot be substantially improved locally without causing a larger deterioration in the lower-level objective. To this end, we show that a simple and implementable variant of the dynamic barrier gradient descent (DBGD) framework can effectively solve the considered nonconvex simple bilevel problems up to stationarity. Specifically, to reach an $(ε_f, ε_g)$-stationary point-where $ε_f$ and $ε_g$ denote the target stationarity accuracies for the upper- and lower-level objectives, respectively-the considered method achieves a complexity of $\\mathcal{O}\\left(\\max\\left(ε_f^{-\\frac{3+p}{1+p}}, ε_g^{-\\frac{3+p}{2}}\\right)\\right)$, where $p \\geq 0$ is an arbitrary constant balancing the terms. To the best of our knowledge, this is the first complexity result for a discrete-time algorithm that guarantees joint stationarity for both levels in general nonconvex simple bilevel problems.","short_abstract":"In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper- and lower-level objectives are smooth but potentially nonconvex. Due to the absenc...","url_abs":"https://arxiv.org/abs/2507.23155","url_pdf":"https://arxiv.org/pdf/2507.23155v1","authors":"[\"Jincheng Cao\",\"Ruichen Jiang\",\"Erfan Yazdandoost Hamedani\",\"Aryan Mokhtari\"]","published":"2025-07-30T23:10:29Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"cs.LG\"]","methods":"[]","has_code":false}
