{"ID":2894382,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.11256","arxiv_id":"2507.11256","title":"Fully Dynamic Euclidean k-Means","abstract":"We consider the Euclidean $k$-means clustering problem in a dynamic setting, where we have to explicitly maintain a solution (a set of $k$ centers) $S \\subseteq \\mathbb{R}^d$ subject to point insertions/deletions in $\\mathbb{R}^d$. We present a dynamic algorithm for Euclidean $k$-means with $\\mathrm{poly}(1/ε)$-approximation ratio, $\\tilde{O}(k^ε)$ update time, and $\\tilde{O}(1)$ recourse, for any $ε\\in (0,1)$, even when $d$ and $k$ are both part of the input. This is the first algorithm to achieve a constant ratio with $o(k)$ update time for this problem, whereas the previous $O(1)$-approximation runs in $\\tilde O(k)$ update time [Bhattacharya, Costa, Farokhnejad; STOC'25]. In fact, previous algorithms cannot go beyond $O(k)$ update time precisely because they are designed for general metrics where an $Ω(k)$ lower bound is known. We break this $O(k)$ barrier by devising new fundamental data structures to utilize Euclidean properties: a structure that (implicitly) maintains a clustering subject to both center and data point updates, and a range query structure that can evaluate a mergeable function over any metric ball range given as a query. To obtain these structures, we devise the first consistent hashing scheme [Czumaj, Jiang, Krauthgamer, Vesel{ý}, Yang; FOCS'22] that achieves $\\tilde O(n^ε)$ running time per point evaluation with competitive parameters. Our final algorithm exploits the framework of [Bhattacharya, Costa, Farokhnejad; STOC'25] for general metrics. The key change is to redesign several critical subroutines so that they reduce to our new Euclidean data structures, replacing the general-metric implementations that are unlikely to run efficiently even when Euclidean properties are provided.","short_abstract":"We consider the Euclidean $k$-means clustering problem in a dynamic setting, where we have to explicitly maintain a solution (a set of $k$ centers) $S \\subseteq \\mathbb{R}^d$ subject to point insertions/deletions in $\\mathbb{R}^d$. We present a dynamic algorithm for Euclidean $k$-means with $\\mathrm{poly}(1/ε)$-approxi...","url_abs":"https://arxiv.org/abs/2507.11256","url_pdf":"https://arxiv.org/pdf/2507.11256v4","authors":"[\"Sayan Bhattacharya\",\"Martín Costa\",\"Ermiya Farokhnejad\",\"Shaofeng H. -C. Jiang\",\"Yaonan Jin\",\"Jianing Lou\"]","published":"2025-07-15T12:30:40Z","proceeding":"cs.DS","tasks":"[\"cs.DS\"]","methods":"[]","has_code":false}
