{"ID":2835141,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00389","arxiv_id":"2512.00389","title":"Solving Neural Min-Max Games: The Role of Architecture, Initialization \u0026 Dynamics","abstract":"Many emerging applications - such as adversarial training, AI alignment, and robust optimization - can be framed as zero-sum games between neural nets, with von Neumann-Nash equilibria (NE) capturing the desirable system behavior. While such games often involve non-convex non-concave objectives, empirical evidence shows that simple gradient methods frequently converge, suggesting a hidden geometric structure. In this paper, we provide a theoretical framework that explains this phenomenon through the lens of hidden convexity and overparameterization. We identify sufficient conditions - spanning initialization, training dynamics, and network width - that guarantee global convergence to a NE in a broad class of non-convex min-max games. To our knowledge, this is the first such result for games that involve two-layer neural networks. Technically, our approach is twofold: (a) we derive a novel path-length bound for the alternating gradient descent-ascent scheme in min-max games; and (b) we show that the reduction from a hidden convex-concave geometry to a two-sided Polyak-Łojasiewicz (PŁ) min-max condition holds with high probability under overparameterization, using tools from random matrix theory.","short_abstract":"Many emerging applications - such as adversarial training, AI alignment, and robust optimization - can be framed as zero-sum games between neural nets, with von Neumann-Nash equilibria (NE) capturing the desirable system behavior. While such games often involve non-convex non-concave objectives, empirical evidence show...","url_abs":"https://arxiv.org/abs/2512.00389","url_pdf":"https://arxiv.org/pdf/2512.00389v1","authors":"[\"Deep Patel\",\"Emmanouil-Vasileios Vlatakis-Gkaragkounis\"]","published":"2025-11-29T08:37:19Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.GT\",\"stat.ML\"]","methods":"[]","has_code":false}
