{"ID":3049899,"CreatedAt":"2026-06-04T02:13:16.786527022Z","UpdatedAt":"2026-06-06T15:44:26.945507316Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05140","arxiv_id":"2606.05140","title":"Phase transitions for the noisy transformer model in arbitrary dimension","abstract":"We study the McKean--Vlasov free energy on the unit sphere associated with the unnormalized self-attention (USA) model for noisy transformer dynamics. We prove a sharp global-minimizer dichotomy in every dimension $d\\ge2$. There is a unique $β_*^{(d)}\u003e0$ such that \\begin{equation*} \\frac{I_{d/2+1}(β_*^{(d)})}{I_{d/2}(β_*^{(d)})}=\\frac1d, \\end{equation*} where $I_ν$ is the modified Bessel function of the first kind. For $0\u003cβ\\le β_*^{(d)}$, the uniform density remains the unique global minimizer up to the linear-stability threshold \\begin{equation*} K_\\#^{(d)}(β)=\\frac{β^{d/2}}{2^{d/2}Γ(d/2)I_{d/2}(β)}, \\end{equation*} and the phase transition is continuous. For $β\u003eβ_*^{(d)}$, the uniform density is not globally minimizing at $K_\\#^{(d)}(β)$, so the critical coupling satisfies $K_c\u003cK_\\#^{(d)}(β)$ and the transition is discontinuous. This result generalizes the authors' recent $d=2$ work arXiv:2604.16288 to arbitrary dimension. The proof uses the sharp Beckner--Onofri/logarithmic Hardy-Littlewood-Sobolev (HLS) inequality on the sphere, together with a Funk--Hecke/Bessel coefficient computation and a degree-two quartic obstruction.","short_abstract":"We study the McKean--Vlasov free energy on the unit sphere associated with the unnormalized self-attention (USA) model for noisy transformer dynamics. We prove a sharp global-minimizer dichotomy in every dimension $d\\ge2$. There is a unique $β_*^{(d)}\u003e0$ such that \\begin{equation*} \\frac{I_{d/2+1}(β_*^{(d)})}{I_{d/2}(β...","url_abs":"https://arxiv.org/abs/2606.05140","url_pdf":"https://arxiv.org/pdf/2606.05140v1","authors":"[\"Kyunghoo Mun\",\"Matthew Rosenzweig\"]","published":"2026-06-03T17:49:43Z","proceeding":"math.AP","tasks":"[\"math.AP\",\"math-ph\",\"math.PR\",\"stat.ML\"]","methods":"[\"Transformer\"]","has_code":false}
