{"ID":2828174,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.15684","arxiv_id":"2512.15684","title":"High-Dimensional Partial Least Squares: Spectral Analysis and Fundamental Limitations","abstract":"Partial Least Squares (PLS) is a widely used method for data integration, designed to extract latent components shared across paired high-dimensional datasets. Despite decades of practical success, a precise theoretical understanding of its behavior in high-dimensional regimes remains limited. In this paper, we study a data integration model in which two high-dimensional data matrices share a low-rank common latent structure while also containing individual-specific components. We analyze the singular vectors of the associated cross-covariance matrix using tools from random matrix theory and derive asymptotic characterizations of the alignment between estimated and true latent directions. These results provide a quantitative explanation of the reconstruction performance of the PLS variant based on Singular Value Decomposition (PLS-SVD) and identify regimes where the method exhibits counter-intuitive or limiting behavior. Building on this analysis, we compare PLS-SVD with principal component analysis applied separately to each dataset and show its asymptotic superiority in detecting the common latent subspace. Overall, our results offer a comprehensive theoretical understanding of high-dimensional PLS-SVD, clarifying both its advantages and fundamental limitations.","short_abstract":"Partial Least Squares (PLS) is a widely used method for data integration, designed to extract latent components shared across paired high-dimensional datasets. Despite decades of practical success, a precise theoretical understanding of its behavior in high-dimensional regimes remains limited. In this paper, we study a...","url_abs":"https://arxiv.org/abs/2512.15684","url_pdf":"https://arxiv.org/pdf/2512.15684v1","authors":"[\"Victor Léger\",\"Florent Chatelain\"]","published":"2025-12-17T18:38:01Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\"]","methods":"[]","has_code":false}
