{"ID":2828457,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.14286","arxiv_id":"2512.14286","title":"An Additively Preconditioned Trust Region Strategy for Machine Learning","abstract":"Modern machine learning, especially the training of deep neural networks, depends on solving large-scale, highly nonconvex optimization problems, whose objective functions exhibit a rough landscape. Motivated by the success of parallel preconditioners in the context of Krylov methods for large-scale linear systems, we introduce a novel nonlinearly preconditioned Trust-Region method that makes use of an additive Schwarz correction at each minimization step, thereby accelerating convergence. More precisely, we propose a variant of the Additively Preconditioned Trust-Region Strategy (APTS), which combines a right-preconditioned additive Schwarz framework with a classical Trust-Region algorithm. By decomposing the parameter space into sub-domains, APTS solves local non-linear sub-problems in parallel and assembles their corrections additively. The resulting method not only shows fast convergence; due to the underlying Trust-Region strategy, it furthermore largely obviates the need for hyperparameter tuning.","short_abstract":"Modern machine learning, especially the training of deep neural networks, depends on solving large-scale, highly nonconvex optimization problems, whose objective functions exhibit a rough landscape. Motivated by the success of parallel preconditioners in the context of Krylov methods for large-scale linear systems, we i...","url_abs":"https://arxiv.org/abs/2512.14286","url_pdf":"https://arxiv.org/pdf/2512.14286v1","authors":"[\"Samuel Cruz Alegría\",\"Bindi Çapriqi\",\"Shega Likaj\",\"Ken Trotti\",\"Rolf Krause\"]","published":"2025-12-16T10:55:20Z","proceeding":"math.NA","tasks":"[\"math.NA\",\"math.OC\"]","methods":"[]","has_code":false}