{"ID":2842546,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.08875","arxiv_id":"2511.08875","title":"New perturbation bounds for low rank approximation of matrices: Beyond Eckart-Young-Mirsky","abstract":"Let $A$ be an $m \\times n$ matrix with rank $r$ and spectral decomposition $A = \\sum_{i=1}^r σ_i u_i v_i^\\top,$ where $σ_i$ are its singular values, ordered decreasingly, and $u_i, v_i$ are the corresponding left and right singular vectors. For a parameter $1 \\le p \\le r$, $A_p := \\sum_{i=1}^p σ_i u_i v_i^\\top$ is the best rank $p$ approximation of $A$. In practice, one often chooses $p$ to be small, leading to the commonly used phrase \"low-rank approximation\". Low-rank approximation plays a central role in data science because it can substantially reduce the dimensionality of the original data, the matrix $A$. For a large data matrix $A$, one typically computes a rank-$p$ approximation $A_p$ for a suitably chosen small $p$, stores $A_p$, and uses it as input for further computations. The reduced dimension of $A_p$ enables faster computations and significant data compression. In practice, noise is inevitable. We often have access only to noisy data $\\tilde A = A + E$, where $E$ represents the noise. Consequently, the low-rank approximation used as input in many downstream tasks is $\\tilde A_p$, the best rank $p$ approximation of $\\tilde A$, rather than $A_p$. Therefore, it is natural and important to estimate the error $ \\| \\tilde A_p - A_p \\|$. This error plays a critical role in estimating the accuracy of the output of any process involving a low-rank approximation of noisy input. In this paper, we develop a new method (based on contour analysis) to bound $\\| \\tilde A_p - A_p \\|$. With this method, we can exploit new parameters that measure the skewness between the noise matrix $E$ and the singular vectors of $A$, avoiding the worst-case analysis used in traditional approaches. In many settings, we obtain notable quantitative improvements compared to classical approaches (using the Eckart-Young-Mirsky theorem or the Davis-Kahan theorem).","short_abstract":"Let $A$ be an $m \\times n$ matrix with rank $r$ and spectral decomposition $A = \\sum_{i=1}^r σ_i u_i v_i^\\top,$ where $σ_i$ are its singular values, ordered decreasingly, and $u_i, v_i$ are the corresponding left and right singular vectors. For a parameter $1 \\le p \\le r$, $A_p := \\sum_{i=1}^p σ_i u_i v_i^\\top$ is the...","url_abs":"https://arxiv.org/abs/2511.08875","url_pdf":"https://arxiv.org/pdf/2511.08875v4","authors":"[\"Phuc Tran\",\"Van Vu\"]","published":"2025-11-12T01:28:51Z","proceeding":"math.NA","tasks":"[\"math.NA\",\"math.OC\",\"math.SP\"]","methods":"[]","has_code":false}
