{"ID":2880962,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.12627","arxiv_id":"2508.12627","title":"On computing and the complexity of computing higher-order $U$-statistics, exactly","abstract":"Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper aims to fill this gap by presenting several results related to the computational aspect of $U$-statistics. First, we derive a useful decomposition from a $m$-th order $U$-statistic to a linear combination of $V$-statistics with orders not exceeding $m$, which are generally more feasible to compute. Second, we explore the connection between exactly computing $V$-statistics and Einstein summation, a tool often used in computational mathematics and quantum computing to accelerate tensor computations. Third, we provide an optimistic estimate of the time complexity for exactly computing $U$-statistics, based on the treewidth of a particular graph associated with the $U$-statistic kernel. The above ingredients lead to (1) a new, much more runtime-efficient algorithm to exactly compute general higher-order $U$-statistics, and (2) a more streamlined characterization of runtime complexity of computing $U$-statistics. We develop an accompanying open-source package called \\texttt{u-stats} in both Python (https://github.com/zrq1706/U-Statistics-Python) and R (https://github.com/cxy0714/U-Statistics-R). We demonstrate through three examples in statistics that \\texttt{u-stats} achieves impressive runtime performance compared to existing benchmarks. This paper also aspires to achieve two goals: (1) to capture the interest of researchers in both statistics and other related areas to further advance the algorithmic development of $U$-statistics and (2) to lift the burden of implementing higher-order $U$-statistics from practitioners.","short_abstract":"Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their computational complexity is surprisingly lacking. This paper aims to fill this ga...","url_abs":"https://arxiv.org/abs/2508.12627","url_pdf":"https://arxiv.org/pdf/2508.12627v2","authors":"[\"Xingyu Chen\",\"Ruiqi Zhang\",\"Lin Liu\"]","published":"2025-08-18T05:01:10Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.DS\",\"math.NA\",\"stat.CO\",\"stat.ME\"]","methods":"[]","has_code":false,"code_links":[{"ID":610747,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2880962,"paper_url":"https://arxiv.org/abs/2508.12627","paper_title":"On computing and the complexity of computing higher-order $U$-statistics, exactly","repo_url":"https://github.com/zrq1706/U-Statistics-Python","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0},{"ID":610748,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2880962,"paper_url":"https://arxiv.org/abs/2508.12627","paper_title":"On computing and the complexity of computing higher-order $U$-statistics, exactly","repo_url":"https://github.com/cxy0714/U-Statistics-R","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
