{"ID":2862446,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.00079","arxiv_id":"2510.00079","title":"Directed Information $γ$-covering: An Information-Theoretic Framework for Context Engineering","abstract":"We introduce \\textbf{Directed Information $γ$-covering}, a simple but general framework for redundancy-aware context engineering. Directed information (DI), a causal analogue of mutual information, measures asymmetric predictiveness between chunks. If $\\operatorname{DI}_{i \\to j} \\ge H(C_j) - γ$, then $C_i$ suffices to represent $C_j$ up to $γ$ bits. Building on this criterion, we formulate context selection as a $γ$-cover problem and propose a greedy algorithm with provable guarantees: it preserves query information within bounded slack, inherits $(1+\\ln n)$ and $(1-1/e)$ approximations from submodular set cover, and enforces a diversity margin. Importantly, building the $γ$-cover is \\emph{query-agnostic}: it incurs no online cost and can be computed once offline and amortized across all queries. Experiments on HotpotQA show that $γ$-covering consistently improves over BM25, a competitive baseline, and provides clear advantages in hard-decision regimes such as context compression and single-slot prompt selection. These results establish DI $γ$-covering as a principled, self-organizing backbone for modern LLM pipelines.","short_abstract":"We introduce \\textbf{Directed Information $γ$-covering}, a simple but general framework for redundancy-aware context engineering. Directed information (DI), a causal analogue of mutual information, measures asymmetric predictiveness between chunks. If $\\operatorname{DI}_{i \\to j} \\ge H(C_j) - γ$, then $C_i$ suffices to...","url_abs":"https://arxiv.org/abs/2510.00079","url_pdf":"https://arxiv.org/pdf/2510.00079v1","authors":"[\"Hai Huang\"]","published":"2025-09-30T02:41:11Z","proceeding":"cs.IT","tasks":"[\"cs.IT\",\"cs.LG\",\"stat.ML\"]","methods":"[\"Large Language Model\",\"Generative Adversarial Network\"]","has_code":false}
