{"ID":2893081,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.14261","arxiv_id":"2507.14261","title":"FAMST: Fast Approximate Minimum Spanning Tree Construction for Large-Scale and High-Dimensional Data","abstract":"We present Fast Approximate Minimum Spanning Tree (FAMST), a novel algorithm that addresses the computational challenges of constructing Minimum Spanning Trees (MSTs) for large-scale and high-dimensional datasets. FAMST utilizes a three-phase approach: Approximate Nearest Neighbor (ANN) graph construction, ANN inter-component connection, and iterative edge refinement. For a dataset of $n$ points in a $d$-dimensional space, FAMST achieves $\\mathcal{O}(dn \\log n)$ time complexity and $\\mathcal{O}(dn + kn)$ space complexity when $k$ nearest neighbors are considered, which is a significant improvement over the $\\mathcal{O}(n^2)$ time and space complexity of traditional methods. Experiments across diverse datasets demonstrate that FAMST achieves remarkably low approximation errors while providing speedups of up to 1000$\\times$ compared to exact MST algorithms. We analyze how the key hyperparameters, $k$ (neighborhood size) and $λ$ (inter-component edges), affect performance, providing practical guidelines for hyperparameter selection. FAMST enables MST-based analysis on datasets with millions of points and thousands of dimensions, extending the applicability of MST techniques to problem scales previously considered infeasible.","short_abstract":"We present Fast Approximate Minimum Spanning Tree (FAMST), a novel algorithm that addresses the computational challenges of constructing Minimum Spanning Trees (MSTs) for large-scale and high-dimensional datasets. FAMST utilizes a three-phase approach: Approximate Nearest Neighbor (ANN) graph construction, ANN inter-co...","url_abs":"https://arxiv.org/abs/2507.14261","url_pdf":"https://arxiv.org/pdf/2507.14261v1","authors":"[\"Mahmood K. M. Almansoori\",\"Miklos Telek\"]","published":"2025-07-18T12:53:58Z","proceeding":"cs.DS","tasks":"[\"cs.DS\",\"cs.AI\"]","methods":"[]","has_code":false}
