{"ID":2849570,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.23198","arxiv_id":"2510.23198","title":"PTPP-Aware Adaptation Scaling Laws: Predicting Domain-Adaptation Performance at Unseen Pre-Training Budgets","abstract":"Continual pre-training (CPT) for domain adaptation must balance target-domain gains with stability on the base domain. Existing CPT scaling laws typically assume a fixed pre-training budget, which limits their ability to forecast adaptation outcomes for models trained at different tokens-per-parameter (PTPP). We present PTPP-aware adaptation scaling laws that make the pre-training budget an explicit variable, enabling accurate prediction of adaptation loss at unseen PTPP. On a multilingual setup (English/Arabic $\\rightarrow$ French), PTPP-aware formulations trained on early stages (PTPP = {15, 31}) predict target loss at PTPP = 279 and outperform a PTPP-agnostic D-CPT transfer baseline on metrics (Huber-on-log, MAE$_\\mathrm{rel}$, calibration slope); full diagnostics (RMSE, MAPE) are in the appendix. Beyond forecasting, we show a practical use case: planning replay ratios and adaptation token budgets that satisfy target and forgetting constraints under compute limits.","short_abstract":"Continual pre-training (CPT) for domain adaptation must balance target-domain gains with stability on the base domain. Existing CPT scaling laws typically assume a fixed pre-training budget, which limits their ability to forecast adaptation outcomes for models trained at different tokens-per-parameter (PTPP). We presen...","url_abs":"https://arxiv.org/abs/2510.23198","url_pdf":"https://arxiv.org/pdf/2510.23198v1","authors":"[\"Etienne Goffinet\",\"Shane Bergsma\",\"Avraham Sheinin\",\"Natalia Vassilieva\",\"Shaheer Muhammad\",\"Preslav Nakov\",\"Gurpreet Gosal\"]","published":"2025-10-27T10:36:15Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.CL\"]","methods":"[]","has_code":false}