{"ID":2854520,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.14459","arxiv_id":"2510.14459","title":"Holdout-Loss-Based Data Selection for LLM Finetuning via In-Context Learning","abstract":"Fine-tuning large pretrained language models is a common approach for aligning them with human preferences, but noisy or off-target examples can dilute supervision. While small, well-chosen datasets often match the performance of much larger ones, systematic and efficient ways to identify high-value training data remain underexplored. Many current methods rely on heuristics or expensive retraining. We present a principled, resource-efficient framework for data selection and reweighting. At its core is an In-Context Approximation (ICA) that estimates the holdout loss a model would incur after training on a candidate example by conditioning on a small, curated holdout set in context. ICA requires no reference model and no additional finetuning. We define the resulting estimate as the ICA score, and derive per-example weights that dynamically reweight gradient updates as model parameters evolve. Across SFT, DPO, and SimPO, and over diverse backbones and datasets, ICA-based reweighting consistently improves model alignment with minimal overhead. We analyze sensitivity to score update frequency and the number of in-context holdout examples. We also discuss limitations in rapidly drifting on-policy settings, highlighting directions for future work. Code and prompts will be released.","short_abstract":"Fine-tuning large pretrained language models is a common approach for aligning them with human preferences, but noisy or off-target examples can dilute supervision. While small, well-chosen datasets often match the performance of much larger ones, systematic and efficient ways to identify high-value training data remai...","url_abs":"https://arxiv.org/abs/2510.14459","url_pdf":"https://arxiv.org/pdf/2510.14459v2","authors":"[\"Ling Zhang\",\"Xianliang Yang\",\"Juwon Yu\",\"Park Cheonyoung\",\"Miran Lee\",\"Lei Song\",\"Jiang Bian\"]","published":"2025-10-16T09:00:39Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
