{"ID":2893043,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.13804","arxiv_id":"2507.13804","title":"Gradient descent avoids strict saddles with a simple line-search method too","abstract":"It is known that gradient descent (GD) on a $C^2$ cost function generically avoids strict saddle points when using a small, constant step size. However, no such guarantee existed for GD with a line-search method. We provide one for a modified version of the standard Armijo backtracking method with generic, arbitrarily large initial step size. The proof underlines the double role of the Luzin $N^{-1}$ property for the iteration maps, and allows to forgo the habitual Lipschitz gradient assumption. We extend this to the Riemannian setting (RGD), assuming the retraction is real analytic (though the cost function still only needs to be $C^2$). In closing, we also improve guarantees for RGD with a constant step size in some scenarios.","short_abstract":"It is known that gradient descent (GD) on a $C^2$ cost function generically avoids strict saddle points when using a small, constant step size. However, no such guarantee existed for GD with a line-search method. We provide one for a modified version of the standard Armijo backtracking method with generic, arbitrarily...","url_abs":"https://arxiv.org/abs/2507.13804","url_pdf":"https://arxiv.org/pdf/2507.13804v2","authors":"[\"Andreea-Alexandra Muşat\",\"Nicolas Boumal\"]","published":"2025-07-18T10:32:42Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"math.DS\",\"math.NA\"]","methods":"[]","has_code":false}
