{"ID":2832091,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.06950","arxiv_id":"2512.06950","title":"PARIS: Pruning Algorithm via the Representer theorem for Imbalanced Scenarios","abstract":"The challenge of \\textbf{imbalanced regression} arises when standard Empirical Risk Minimization (ERM) biases models toward high-frequency regions of the data distribution, causing severe degradation on rare but high-impact ``tail'' events. Existing strategies such as loss re-weighting or synthetic over-sampling often introduce noise, distort the underlying distribution, or add substantial algorithmic complexity. We introduce \\textbf{PARIS} (Pruning Algorithm via the Representer theorem for Imbalanced Scenarios), a principled framework that mitigates imbalance by \\emph{optimizing the training set itself}. PARIS leverages the representer theorem for neural networks to compute a \\textbf{closed-form representer deletion residual}, which quantifies the exact change in validation loss caused by removing a single training point \\emph{without retraining}. Combined with an efficient Cholesky rank-one downdating scheme, PARIS performs fast, iterative pruning that eliminates uninformative or performance-degrading samples. We use a real-world space weather example, where PARIS reduces the training set by up to 75\\% while preserving or improving overall RMSE, outperforming re-weighting, synthetic oversampling, and boosting baselines. Our results demonstrate that representer-guided dataset pruning is a powerful, interpretable, and computationally efficient approach to rare-event regression.","short_abstract":"The challenge of \\textbf{imbalanced regression} arises when standard Empirical Risk Minimization (ERM) biases models toward high-frequency regions of the data distribution, causing severe degradation on rare but high-impact ``tail'' events. 
Existing strategies such as loss re-weighting or synthetic over-sampling often i...","url_abs":"https://arxiv.org/abs/2512.06950","url_pdf":"https://arxiv.org/pdf/2512.06950v1","authors":"[\"Enrico Camporeale\"]","published":"2025-12-07T18:05:20Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\",\"physics.space-ph\"]","methods":"[]","has_code":false}
