{"ID":2879527,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.16531","arxiv_id":"2508.16531","title":"Quality control in sublinear time: a case study via random graphs","abstract":"Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call \"Quality Control Problems.\" These problems are specified by a (positive, real-valued) \"quality function\" $ρ$ and a distribution $D$ such that, with high probability, a sample drawn from $D$ is \"high quality,\" meaning its $ρ$-value is near $1$. The goal is to accept inputs $x \\sim D$ and reject potentially adversarially generated inputs $x$ with $ρ(x)$ far from $1$. The objective of quality control is thus weaker than either component problem: testing for \"$ρ(x) \\approx 1$\" or testing if $x \\sim D$, and offers the possibility of more efficient algorithms. In this work, we consider the sublinear version of the quality control problem, where $D \\in Δ(\\{0,1\\}^N)$ and the goal is to solve the $(D ,ρ)$-quality problem with $o(N)$ queries and time. As a case study, we consider random graphs, i.e., $D = G_{n,p}$ (and $N = \\binom{n}2$), and the $k$-clique count function $ρ_k := C_k(G)/\\mathbb{E}_{G' \\sim G_{n,p}}[C_k(G')]$, where $C_k(G)$ is the number of $k$-cliques in $G$. Testing if $G \\sim G_{n,p}$ with one sample, let alone with sublinear query access to the sample, is of course impossible. Testing if $ρ_k(G)\\approx 1$ requires $p^{-Ω(k^2)}$ samples. In contrast, we show that the quality control problem for $G_{n,p}$ (with $n \\geq p^{-ck}$ for some constant $c$) with respect to $ρ_k$ can be tested with $p^{-O(k)}$ queries and time, showing quality control is provably superpolynomially more efficient in this setting. More generally, for a motif $H$ of maximum degree $Δ(H)$, the respective quality control problem can be solved with $p^{-O(Δ(H))}$ queries and running time.","short_abstract":"Many algorithms are designed to work well on average over inputs. When running such an algorithm on an arbitrary input, we must ask: Can we trust the algorithm on this input? We identify a new class of algorithmic problems addressing this, which we call \"Quality Control Problems.\" These problems are specified by a (pos...","url_abs":"https://arxiv.org/abs/2508.16531","url_pdf":"https://arxiv.org/pdf/2508.16531v2","authors":"[\"Cassandra Marcussen\",\"Ronitt Rubinfeld\",\"Madhu Sudan\"]","published":"2025-08-22T16:54:18Z","proceeding":"cs.DS","tasks":"[\"cs.DS\",\"cs.LG\",\"math.CO\",\"math.PR\"]","methods":"[]","has_code":false}
