{"ID":2893180,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.14060","arxiv_id":"2507.14060","title":"Sparse Navigable Graphs for Nearest Neighbor Search: Algorithms and Hardness","abstract":"We initiate the study of approximation algorithms and computational barriers for constructing sparse $α$-navigable graphs [IX23, DGM+24], a core primitive underlying recent advances in graph-based nearest neighbor search. Given an $n$-point dataset $P$ with an associated metric $\\mathsf{d}$ and a parameter $α\\geq 1$, the goal is to efficiently build the sparsest graph $G=(P, E)$ that is $α$-navigable: for every distinct $s, t \\in P$, there exists an edge $(s, u) \\in E$ with $\\mathsf{d}(u, t) \u003c \\mathsf{d}(s, t)/α$. We consider two natural sparsity objectives: minimizing the maximum out-degree and minimizing the total size. We first show a strong negative result: the slow-preprocessing version of DiskANN (analyzed in [IX23] for low-doubling metrics) can yield solutions whose sparsity is $\\widetildeΩ(n)$ times larger than optimal, even on Euclidean instances. We then show a tight approximation-preserving equivalence between the Sparsest Navigable Graph problem and the classic Set Cover problem, obtaining an $O(n^3)$-time $(\\ln n + 1)$-approximation algorithm, as well as establishing NP-hardness of achieving an $o(\\ln n)$-approximation. Building on this equivalence, we develop faster $O(\\ln n)$-approximation algorithms. The first runs in $\\widetilde{O}(n \\cdot \\mathrm{OPT})$ time and is thus much faster when the optimal solution is sparse. The second, based on fast matrix multiplication, is a bicriteria algorithm that computes an $O(\\ln n)$-approximation to the sparsest $2α$-navigable graph, running in $\\widetilde{O}(n^ω)$ time. Finally, we complement our upper bounds with a query complexity lower bound, showing that any $o(n)$-approximation requires examining $Ω(n^2)$ distances. This result shows that in the regime where $\\mathrm{OPT} = \\widetilde{O}(n)$, our $\\widetilde{O}(n \\cdot \\mathrm{OPT})$-time algorithm is essentially best possible.","short_abstract":"We initiate the study of approximation algorithms and computational barriers for constructing sparse $α$-navigable graphs [IX23, DGM+24], a core primitive underlying recent advances in graph-based nearest neighbor search. Given an $n$-point dataset $P$ with an associated metric $\\mathsf{d}$ and a parameter $α\\geq 1$, t...","url_abs":"https://arxiv.org/abs/2507.14060","url_pdf":"https://arxiv.org/pdf/2507.14060v1","authors":"[\"Sanjeev Khanna\",\"Ashwin Padaki\",\"Erik Waingarten\"]","published":"2025-07-18T16:41:21Z","proceeding":"cs.DS","tasks":"[\"cs.DS\"]","methods":"[]","has_code":false}
