{"ID":2838821,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.17822","arxiv_id":"2511.17822","title":"High-Accuracy List-Decodable Mean Estimation","abstract":"In list-decodable learning, we are given a set of data points such that an $α$-fraction of these points come from a nice distribution $D$, for some small $α\\ll 1$, and the goal is to output a short list of candidate solutions, such that at least one element of this list recovers some non-trivial information about $D$. By now, there is a large body of work on this topic; however, while many algorithms can achieve optimal list size in terms of $α$, all known algorithms must incur error which decays, in some cases quite poorly, with $1 / α$. In this paper, we ask if this is inherent: is it possible to trade off list size with accuracy in list-decodable learning? More formally, given $ε\u003e 0$, can we output a slightly larger list in terms of $α$ and $ε$, but so that one element of this list has error at most $ε$ with the ground truth? We call this problem high-accuracy list-decodable learning. Our main result is that non-trivial high-accuracy guarantees, both information-theoretically and algorithmically, are possible for the canonical setting of list-decodable mean estimation of identity-covariance Gaussians. Specifically, we demonstrate that there exists a list of candidate means of size at most $L = \\exp \\left( O\\left( \\tfrac{\\log^2 1 / α}{ε^2} \\right)\\right)$ so that one of the elements of this list has $\\ell_2$ distance at most $ε$ to the true mean. We also design an algorithm that outputs such a list with runtime and sample complexity $n = d^{O(\\log L)} + \\exp \\exp (\\widetilde{O}(\\log L))$. 
We do so by demonstrating a completely novel proof of identifiability, as well as a new algorithmic way of leveraging this proof without the sum-of-squares hierarchy, which may be of independent technical interest.","short_abstract":"In list-decodable learning, we are given a set of data points such that an $α$-fraction of these points come from a nice distribution $D$, for some small $α\\ll 1$, and the goal is to output a short list of candidate solutions, such that at least one element of this list recovers some non-trivial information about $D$....","url_abs":"https://arxiv.org/abs/2511.17822","url_pdf":"https://arxiv.org/pdf/2511.17822v1","authors":"[\"Ziyun Chen\",\"Spencer Compton\",\"Daniel Kane\",\"Jerry Li\"]","published":"2025-11-21T22:35:19Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.DS\",\"stat.ML\"]","methods":"[]","has_code":false}