{"ID":2832153,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.06239","arxiv_id":"2512.06239","title":"LOCUS: A System and Method for Low-Cost Customization for Universal Specialization","abstract":"We present LOCUS (LOw-cost Customization for Universal Specialization), a pipeline that consumes few-shot data to streamline the construction and training of NLP models through targeted retrieval, synthetic data generation, and parameter-efficient tuning. With only a small number of labeled examples, LOCUS discovers pertinent data in a broad repository, synthesizes additional training samples via in-context data generation, and fine-tunes models using either full or low-rank (LoRA) parameter adaptation. Our approach targets named entity recognition (NER) and text classification (TC) benchmarks, consistently outperforming strong baselines (including GPT-4o) while substantially lowering costs and model sizes. Our resultant memory-optimized models retain 99% of fully fine-tuned accuracy while using barely 5% of the memory footprint, also beating GPT-4o on several benchmarks with less than 1% of its parameters.","short_abstract":"We present LOCUS (LOw-cost Customization for Universal Specialization), a pipeline that consumes few-shot data to streamline the construction and training of NLP models through targeted retrieval, synthetic data generation, and parameter-efficient tuning. With only a small number of labeled examples, LOCUS discovers pe...","url_abs":"https://arxiv.org/abs/2512.06239","url_pdf":"https://arxiv.org/pdf/2512.06239v1","authors":"[\"Dhanasekar Sundararaman\",\"Keying Li\",\"Wayne Xiong\",\"Aashna Garg\"]","published":"2025-12-06T01:32:58Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"LoRA\"]","has_code":false}
