{"ID":2896116,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.07729","arxiv_id":"2507.07729","title":"Efficient Stochastic BFGS methods Inspired by Bayesian Principles","abstract":"Quasi-Newton methods are ubiquitous in deterministic local search due to their efficiency and low computational cost. This class of methods uses the history of gradient evaluations to approximate second-order derivatives. However, only noisy gradient observations are accessible in stochastic optimization; thus, deriving quasi-Newton methods in this setting is challenging. Although most existing quasi-Newton methods for stochastic optimization rely on deterministic equations that are modified to circumvent noise, we propose a new approach inspired by Bayesian inference to assimilate noisy gradient information and derive the stochastic counterparts to standard quasi-Newton methods. We focus on the derivations of stochastic BFGS and L-BFGS, but our methodology can also be employed to derive stochastic analogs of other quasi-Newton methods. The resulting stochastic BFGS (S-BFGS) and stochastic L-BFGS (L-S-BFGS) can effectively learn an inverse Hessian approximation even with small batch sizes. For a problem of dimension $d$, the iteration cost of S-BFGS is $\\mathcal{O}(d^2)$, and the cost of L-S-BFGS is $\\mathcal{O}(d)$. Numerical experiments with a dimensionality of up to $30,720$ demonstrate the efficiency and robustness of the proposed method.","short_abstract":"Quasi-Newton methods are ubiquitous in deterministic local search due to their efficiency and low computational cost. This class of methods uses the history of gradient evaluations to approximate second-order derivatives. However, only noisy gradient observations are accessible in stochastic optimization; thus, derivin...","url_abs":"https://arxiv.org/abs/2507.07729","url_pdf":"https://arxiv.org/pdf/2507.07729v3","authors":"[\"André Carlon\",\"Luis Espath\",\"Raúl Tempone\"]","published":"2025-07-10T13:08:55Z","proceeding":"math.OC","tasks":"[\"math.OC\"]","methods":"[]","has_code":false}
