{"ID":2868751,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.15822","arxiv_id":"2509.15822","title":"Phase Transition for Stochastic Block Model with more than $\\sqrt{n}$ Communities","abstract":"Predictions from statistical physics postulate that recovery of the communities in Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich literature, proving that non-trivial community recovery is indeed possible in SBM above the KS threshold, as long as the number $K$ of communities remains smaller than $\\sqrt{n}$, where $n$ is the number of nodes in the observed graph. Failure of low-degree polynomials below the KS threshold was also proven when $K=o(\\sqrt{n})$. When $K\\geq \\sqrt{n}$, Chin et al.(2025) recently prove that, in a sparse regime, community recovery in polynomial time is possible below the KS threshold by counting non-backtracking paths. This breakthrough result lead them to postulate a new threshold for the many communities regime $K\\geq \\sqrt{n}$. In this work, we provide evidences that confirm their conjecture for $K\\geq \\sqrt{n}$: 1- We prove that, for any density of the graph, low-degree polynomials fail to recover communities below the threshold postulated by Chin et al.(2025); 2- We prove that community recovery is possible in polynomial time above the postulated threshold, not only in the sparse regime of~Chin et al., but also in some (but not all) moderately sparse regimes by essentially by counting occurrences of cliques or self-avoiding paths of suitable size in the observed graph. In addition, we propose a detailed conjecture regarding the structure of motifs that are optimal in sparsity regimes not covered by cliques or self-avoiding paths counting. In particular, counting self-avoiding paths of length $\\log(n)$--which is closely related to spectral algorithms based on the Non-Backtracking operator--is optimal only in the sparse regime. Other motif counts--unrelated to spectral properties--should be considered in denser regimes.","short_abstract":"Predictions from statistical physics postulate that recovery of the communities in Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich literature, proving that non-trivial community recovery is indeed possible in S...","url_abs":"https://arxiv.org/abs/2509.15822","url_pdf":"https://arxiv.org/pdf/2509.15822v2","authors":"[\"Alexandra Carpentier\",\"Christophe Giraud\",\"Nicolas Verzelen\"]","published":"2025-09-19T09:53:56Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\",\"math.PR\",\"math.ST\"]","methods":"[]","has_code":false}
