{"ID":2849226,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.24621","arxiv_id":"2510.24621","title":"Coreset for Robust Geometric Median: Eliminating Size Dependency on Outliers","abstract":"We study the robust geometric median problem in Euclidean space $\\mathbb{R}^d$, with a focus on coreset construction.A coreset is a compact summary of a dataset $P$ of size $n$ that approximates the robust cost for all centers $c$ within a multiplicative error $\\varepsilon$. Given an outlier count $m$, we construct a coreset of size $\\tilde{O}(\\varepsilon^{-2} \\cdot \\min\\{\\varepsilon^{-2}, d\\})$ when $n \\geq 4m$, eliminating the $O(m)$ dependency present in prior work [Huang et al., 2022 \u0026 2023]. For the special case of $d = 1$, we achieve an optimal coreset size of $\\tildeΘ(\\varepsilon^{-1/2} + \\frac{m}{n} \\varepsilon^{-1})$, revealing a clear separation from the vanilla case studied in [Huang et al., 2023; Afshani and Chris, 2024]. Our results further extend to robust $(k,z)$-clustering in various metric spaces, eliminating the $m$-dependence under mild data assumptions. The key technical contribution is a novel non-component-wise error analysis, enabling substantial reduction of outlier influence, unlike prior methods that retain them.Empirically, our algorithms consistently outperform existing baselines in terms of size-accuracy tradeoffs and runtime, even when data assumptions are violated across a wide range of datasets.","short_abstract":"We study the robust geometric median problem in Euclidean space $\\mathbb{R}^d$, with a focus on coreset construction.A coreset is a compact summary of a dataset $P$ of size $n$ that approximates the robust cost for all centers $c$ within a multiplicative error $\\varepsilon$. Given an outlier count $m$, we construct a c...","url_abs":"https://arxiv.org/abs/2510.24621","url_pdf":"https://arxiv.org/pdf/2510.24621v1","authors":"[\"Ziyi Fang\",\"Lingxiao Huang\",\"Runkai Yang\"]","published":"2025-10-28T16:49:03Z","proceeding":"cs.DS","tasks":"[\"cs.DS\",\"cs.CG\",\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}
