{"ID":2851870,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.19820","arxiv_id":"2510.19820","title":"Tight Lower Bounds for Central String Queries in Compressed Space","abstract":"In this work, we study the limits of compressed data structures, i.e., structures that support various queries on an input text $T\\inΣ^n$ using space proportional to the size of $T$ in compressed form. Nearly all fundamental queries can currently be efficiently supported in $O(δ(T)\\log^{O(1)}n)$ space, where $δ(T)$ is the substring complexity, a strong compressibility measure that lower-bounds the optimal space to represent the text [Kociumaka, Navarro, Prezza, IEEE Trans. Inf. Theory 2023]. However, optimal query time has been characterized only for random access. We address this gap by developing tight lower bounds for nearly all other fundamental queries: (1) We prove that suffix array (SA), inverse suffix array (SA$^{-1}$), longest common prefix (LCP) array, and longest common extension (LCE) queries all require $Ω(\\log n/\\log\\log n)$ time within $O(δ(T)\\log^{O(1)}n)$ space, matching known upper bounds. (2) We further show that other common queries, currently supported in $O(\\log\\log n)$ time and $O(δ(T)\\log^{O(1)}n)$ space, including the Burrows-Wheeler Transform (BWT), permuted longest common prefix (PLCP) array, Last-to-First (LF), inverse LF, lexicographic predecessor ($Φ$), and inverse $Φ$ queries, all require $Ω(\\log\\log n)$ time, yielding another set of tight bounds. Our lower bounds hold even for texts over a binary alphabet. This work establishes a clean dichotomy: the optimal time complexity to support central string queries in compressed space is either $Θ(\\log n/\\log\\log n)$ or $Θ(\\log\\log n)$. This completes the theoretical foundation of compressed indexing, closing a crucial gap between upper and lower bounds and providing a clear target for future data structures: seeking either the optimal time in the smallest space or the fastest time in the optimal space, both of which are now known for central string queries.","short_abstract":"In this work, we study the limits of compressed data structures, i.e., structures that support various queries on an input text $T\\inΣ^n$ using space proportional to the size of $T$ in compressed form. Nearly all fundamental queries can currently be efficiently supported in $O(δ(T)\\log^{O(1)}n)$ space, where $δ(T)$ is...","url_abs":"https://arxiv.org/abs/2510.19820","url_pdf":"https://arxiv.org/pdf/2510.19820v1","authors":"[\"Dominik Kempa\",\"Tomasz Kociumaka\"]","published":"2025-10-22T17:57:36Z","proceeding":"cs.DS","tasks":"[\"cs.DS\"]","methods":"[]","has_code":false}
