{"ID":2872988,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.07599","arxiv_id":"2509.07599","title":"Tight Bounds for Low-Error Frequency Moment Estimation and the Power of Multiple Passes","abstract":"Estimating the second frequency moment $F_2$ of a data stream up to a $(1 \\pm \\varepsilon)$ factor is a central problem in the streaming literature. For errors $\\varepsilon \u003e Ω(1/\\sqrt{n})$, the tight bound $Θ\\left(\\log(\\varepsilon^2 n)/\\varepsilon^2\\right)$ was recently established by Braverman and Zamir. In this work, we complete the picture by resolving the remaining regime of small error, $\\varepsilon \u003c 1/\\sqrt{n}$, showing that the optimal space complexity is $Θ\\left( \\min\\left(n, \\frac{1}{\\varepsilon^2} \\right) \\cdot \\left(1 + \\left| \\log(\\varepsilon^2 n) \\right| \\right) \\right)$ bits for all $\\varepsilon \\geq 1/n^2$, assuming a sufficiently large universe. This closes the gap between the best known $Ω(n)$ lower bound and the straightforward $O(n \\log n)$ upper bound in that range, and shows that essentially storing the entire stream is necessary for high-precision estimation. To derive this bound, we fully characterize the two-party communication complexity of estimating the size of a set intersection up to an arbitrary additive error $\\varepsilon n$. In particular, we prove a tight $Ω(n \\log n)$ lower bound for one-way communication protocols when $\\varepsilon \u003c n^{-1/2-Ω(1)}$, in contrast to classical $O(n)$-bit protocols that use two-way communication. Motivated by this separation, we present a two-pass streaming algorithm that computes the exact histogram of a stream with high probability using only $O(n \\log \\log n)$ bits of space, in contrast to the $Θ(n \\log n)$ bits required in one pass even to approximate $F_2$ with small error. This yields the first asymptotic separation between one-pass and $O(1)$-passes space complexity for small frequency moment estimation.","short_abstract":"Estimating the second frequency moment $F_2$ of a data stream up to a $(1 \\pm \\varepsilon)$ factor is a central problem in the streaming literature. For errors $\\varepsilon \u003e Ω(1/\\sqrt{n})$, the tight bound $Θ\\left(\\log(\\varepsilon^2 n)/\\varepsilon^2\\right)$ was recently established by Braverman and Zamir. In this work...","url_abs":"https://arxiv.org/abs/2509.07599","url_pdf":"https://arxiv.org/pdf/2509.07599v1","authors":"[\"Naomi Green-Maimon\",\"Or Zamir\"]","published":"2025-09-09T11:17:12Z","proceeding":"cs.DS","tasks":"[\"cs.DS\"]","methods":"[]","has_code":false}
