New Algorithmic Directions in Optimal Transport and Applications for Product Spaces

cs.DS arXiv:2509.21502
View PDF arXiv JSON

Abstract

We study optimal transport between two high-dimensional distributions $μ,ν$ in $R^n$ from an algorithmic perspective: given $x \sim μ$, find a close $y \sim ν$ in $poly(n)$ time, where $n$ is the dimension of $x,y$. Thus, running time depends on the dimension rather than the full representation size of $μ,ν$. Our main result is a general algorithm for transporting any product distribution $μ$ to any $ν$ with cost $Δ+ δ$ under $\ell_p^p$, where $Δ$ is the Knothe-Rosenblatt transport cost and $δ$ is a computational error decreasing with runtime. This requires $ν$ to be "sequentially samplable" with bounded average sampling cost, a new but natural notion. We further prove: An algorithmic version of Talagrand's inequality for transporting the standard Gaussian $Φ^n$ to arbitrary $ν$ under squared Euclidean cost. For $ν= Φ^n$ conditioned on a set $\mathcal{S}$ of measure $\varepsilon$, we construct the sequential sampler in expected time $poly(n/\varepsilon)$ using membership oracle access to $\mathcal{S}$. This yields an algorithmic transport from $Φ^n$ to $Φ^n|\mathcal{S}$ in $poly(n/\varepsilon)$ time and expected squared distance $O(\log 1/\varepsilon)$, optimal for general $\mathcal{S}$ of measure $\varepsilon$. As corollary, we obtain the first computational concentration result (Etesami et al. SODA 2020) for Gaussian measure under Euclidean distance with dimension-independent transportation cost, resolving an open question of Etesami et al. Specifically, for any $\mathcal{S}$ of Gaussian measure $\varepsilon$, most $Φ^n$ samples can be mapped to $\mathcal{S}$ within distance $O(\sqrt{\log 1/\varepsilon})$ in $poly(n/\varepsilon)$ time.

PDF Viewer