r*-indexing

cs.DS arXiv:2508.12675
View PDF arXiv JSON

Abstract

Let $T [1..n]$ be a text over an alphabet of size $σ\in \mathrm{polylog} (n)$, let $r^*$ be the sum of the numbers of runs in the Burrows-Wheeler Transforms of $T$ and its reverse, and let $z$ be the number of phrases in the LZ77 parse of $T$. We show how to store $T$ in $O (r^* \log (n / r^*) + z \log n)$ bits such that, given a pattern $P [1..m]$, we can report the locations of the $\mathrm{occ}$ occurrences of $P$ in $T$ in $O (m \log n + \mathrm{occ} \log^εn)$ time. We can also report the position of the leftmost and rightmost occurrences of $P$ in $T$ in the same space and $O (m \log^εn)$ time.

PDF Viewer