{"ID":2850887,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.22065","arxiv_id":"2510.22065","title":"A Retraction-free Method for Nonsmooth Minimax Optimization over a Compact Manifold","abstract":"We study the minimax problem $\\min_{x\\in M} \\max_y f_r(x,y):=f(x,y)-h(y)$, where $M$ is a compact submanifold, $f$ is continuously differentiable in $(x, y)$, $h$ is a closed, weakly-convex (possibly non-smooth) function and we assume that the regularized coupling function $-f_r(x,\\cdot)$ is either $μ$-PL for some $μ\u003e0$ or concave ($μ= 0$) for any fixed $x$ in the vicinity of $M$. To address the nonconvexity due to the manifold constraint, we use an exact penalty for the constraint $x \\in M$, and enforcing a convex constraint $x\\in X$ for some $X \\supset M$, onto which projections can be computed efficiently. Building upon this new formulation for the manifold minimax problem in question, a single-loop smoothed manifold gradient descent-ascent (sm-MGDA) algorithm is proposed. Theoretically, any limit point of sm-MGDA sequence is a stationary point of the manifold minimax problem and sm-MGDA can generate an $O(ε)$-stationary point of the original problem with $O(1/ε^2)$ and $\\tilde{O}(1/ε^4)$ complexity for $μ\u003e 0$ and $μ= 0$ scenarios, respectively. Moreover, for the $μ= 0$ setting, through adopting Tikhonov regularization of the dual, one can improve the complexity to $O(1/ε^3)$ at the expense of asymptotic stationarity. The key component, common in the analysis of all cases, is to connect $ε$-stationary points between the penalized problem and the original problem by showing that the constraint $x \\in X$ becomes inactive and the penalty term tends to $0$ along any convergent subsequence. To our knowledge, sm-MGDA is the first retraction-free algorithm for minimax problems over compact submanifolds, and this is a very desirable algorithmic property since through avoiding retractions, one can get away with matrix orthogonalization subroutines required for computing retractions to manifolds arising in practice, which are not GPU friendly.","short_abstract":"We study the minimax problem $\\min_{x\\in M} \\max_y f_r(x,y):=f(x,y)-h(y)$, where $M$ is a compact submanifold, $f$ is continuously differentiable in $(x, y)$, $h$ is a closed, weakly-convex (possibly non-smooth) function and we assume that the regularized coupling function $-f_r(x,\\cdot)$ is either $μ$-PL for some $μ\u003e0...","url_abs":"https://arxiv.org/abs/2510.22065","url_pdf":"https://arxiv.org/pdf/2510.22065v2","authors":"[\"Necdet Serhat Aybat\",\"Jiang Hu\",\"Zhanwang Deng\"]","published":"2025-10-24T23:03:08Z","proceeding":"math.OC","tasks":"[\"math.OC\"]","methods":"[]","has_code":false}
