{"ID":2838370,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.18207","arxiv_id":"2511.18207","title":"ProHD: Projection-Based Hausdorff Distance Approximation","abstract":"The Hausdorff distance (HD) is a robust measure of set dissimilarity, but computing it exactly on large, high-dimensional datasets is prohibitively expensive. We propose \\textbf{ProHD}, a projection-guided approximation algorithm that dramatically accelerates HD computation while maintaining high accuracy. ProHD identifies a small subset of candidate \"extreme\" points by projecting the data onto a few informative directions (such as the centroid axis and top principal components) and computing the HD on this subset. This approach guarantees an underestimate of the true HD with a bounded additive error and typically achieves results within a few percent of the exact value. In extensive experiments on image, physics, and synthetic datasets (up to two million points in $D=256$), ProHD runs 10--100$\\times$ faster than exact algorithms while attaining 5--20$\\times$ lower error than random sampling-based approximations. Our method enables practical HD calculations in scenarios like large vector databases and streaming data, where quick and reliable set distance estimation is needed.","short_abstract":"The Hausdorff distance (HD) is a robust measure of set dissimilarity, but computing it exactly on large, high-dimensional datasets is prohibitively expensive. We propose \\textbf{ProHD}, a projection-guided approximation algorithm that dramatically accelerates HD computation while maintaining high accuracy. ProHD identi...","url_abs":"https://arxiv.org/abs/2511.18207","url_pdf":"https://arxiv.org/pdf/2511.18207v1","authors":"[\"Jiuzhou Fu\",\"Luanzheng Guo\",\"Nathan R. Tallent\",\"Dongfang Zhao\"]","published":"2025-11-22T22:44:46Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.LG\"]","methods":"[]","has_code":false}
