When to Transfer: Adaptive Source Selection for Positive Transfer in Linear Models

stat.ML arXiv:2510.16986

Abstract

In many business settings, task-specific labeled data are scarce or costly to obtain, limiting supervised learning on a target task. A classical response is transfer learning (TL). Many TL works study how to transfer information from related sources. We study, for linear regression and classification, when to transfer via sample sharing: in a multi-source setting, we greedily decide from which sources, and how many samples from each, to incorporate into the target dataset. Our method uses an accept/reject rule based on a data-dependent estimate of the transfer gain, i.e., the marginal decrease in target predictive error, computed conditionally on the observed target samples. We analyze our approach and show that the derived statistical test enforces positive transfer with high probability. Under additional standard conditions, we also study the transfer gain itself and characterize when transfer is beneficial. Experiments on synthetic and real data show consistent gains over classical and recent strong baselines while avoiding negative transfer.
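
To make the greedy accept/reject idea concrete, here is a minimal Python sketch for the regression case. It is not the paper's procedure: the abstract does not specify the gain estimator or the statistical test, so this sketch substitutes a simple stand-in, namely the drop in mean squared error on a held-out target validation split. The function names (`fit_ols`, `greedy_sample_sharing`), the `batch` and `margin` parameters, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_ols(X, y, ridge=1e-6):
    # Least squares with a tiny ridge term for numerical stability (assumption).
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def greedy_sample_sharing(X_tr, y_tr, X_val, y_val, sources, batch=20, margin=0.0):
    """Greedily add source batches while the estimated gain exceeds `margin`.

    Stand-in gain estimate (NOT the paper's test): decrease in validation MSE
    on held-out target data after refitting with the candidate batch added.
    """
    X_pool, y_pool = X_tr, y_tr
    best = mse(fit_ols(X_pool, y_pool), X_val, y_val)
    taken = {}
    for name, (Xs, ys) in sources.items():
        taken[name] = 0
        for start in range(0, len(ys), batch):
            Xb, yb = Xs[start:start + batch], ys[start:start + batch]
            X_try = np.vstack([X_pool, Xb])
            y_try = np.concatenate([y_pool, yb])
            err = mse(fit_ols(X_try, y_try), X_val, y_val)
            if best - err <= margin:
                break  # reject: stop drawing from this source
            X_pool, y_pool, best = X_try, y_try, err  # accept the batch
            taken[name] += len(yb)
    return taken, fit_ols(X_pool, y_pool), best

# Synthetic illustration: one source close to the target, one far from it.
rng = np.random.default_rng(0)
d = 5
w_target = rng.normal(size=d)

def sample(w, n):
    X = rng.normal(size=(n, d))
    return X, X @ w + 0.1 * rng.normal(size=n)

X_tr, y_tr = sample(w_target, 15)
X_val, y_val = sample(w_target, 15)
sources = {
    "related": sample(w_target + 0.05 * rng.normal(size=d), 200),
    "unrelated": sample(-w_target, 200),
}

baseline = mse(fit_ols(X_tr, y_tr), X_val, y_val)
taken, w_hat, best = greedy_sample_sharing(X_tr, y_tr, X_val, y_val, sources)
print(taken, baseline, best)
```

Because a batch is accepted only when the validation error strictly decreases, the final validation error in this sketch can never exceed the target-only baseline. That is a much weaker guarantee than the high-probability positive-transfer result the abstract claims for the authors' test, since a validation split can itself be overfit.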
