{"ID":2870752,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.11602","arxiv_id":"2509.11602","title":"On the Smallest Size of Internal Collage Systems","abstract":"A Straight-Line Program (SLP) for a string $T$ is a context-free grammar in Chomsky normal form that derives $T$ only, which can be seen as a compressed form of $T$. Kida et al.\\ introduced collage systems [Theor. Comput. Sci., 2003] to generalize SLPs by adding repetition rules and truncation rules. The smallest size $c(T)$ of collage systems for $T$ has gained attention to see how these generalized rules improve the compression ability of SLPs. Navarro et al. [IEEE Trans. Inf. Theory, 2021] showed that $c(T) \\in O(z(T))$ and there is a string family with $c(T) \\in Ω(b(T) \\log |T|)$, where $z(T)$ is the number of phrases in the Lempel-Ziv parsing of $T$ and $b(T)$ is the smallest size of bidirectional schemes for $T$. They also introduced a subclass of collage systems, called internal collage systems, and proved that its smallest size $\\hat{c}(T)$ for $T$ is at least $b(T)$. While $c(T) \\le \\hat{c}(T)$ is obvious, it is unknown how large $\\hat{c}(T)$ is compared to $c(T)$. In this paper, we prove that $\\hat{c}(T) = Θ(c(T))$ by showing that any collage system of size $m$ can be transformed into an internal collage system of size $O(m)$ in $O(m^2)$ time. Thanks to this result, we can focus on internal collage systems to study the asymptotic behavior of $c(T)$, which helps to suppress excess use of truncation rules. As a direct application, we get $b(T) = O(c(T))$, which answers an open question posed in [Navarro et al., IEEE Trans. Inf. Theory, 2021]. We also give a MAX-SAT formulation to compute $\\hat{c}(T)$ for a given $T$.","short_abstract":"A Straight-Line Program (SLP) for a string $T$ is a context-free grammar in Chomsky normal form that derives $T$ only, which can be seen as a compressed form of $T$. Kida et al.\\ introduced collage systems [Theor. Comput. Sci., 2003] to generalize SLPs by adding repetition rules and truncation rules. The smallest size...","url_abs":"https://arxiv.org/abs/2509.11602","url_pdf":"https://arxiv.org/pdf/2509.11602v2","authors":"[\"Soichiro Migita\",\"Kyotaro Uehata\",\"Tomohiro I\"]","published":"2025-09-15T05:34:52Z","proceeding":"cs.DS","tasks":"[\"cs.DS\"]","methods":"[]","has_code":false}
