{"ID":2880882,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.14285","arxiv_id":"2508.14285","title":"Meta-Learning at Scale for Large Language Models via Low-Rank Amortized Bayesian Meta-Learning","abstract":"Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, when a problem requires incorporating information from multiple datasets - as in few shot learning - generalization across datasets can be limited, driving up training costs. As a consequence, other approaches such as in-context learning are typically used in this setting. To address this challenge, we introduce an efficient method for adapting the weights of LLMs to multiple distributions, Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs by reframing where local and global variables are defined in LoRA and using a new hyperparameter to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL supports effective generalization across datasets and scales to large models such as Llama3-8B and Qwen2-7B, outperforming existing methods on the CrossFit and Unified-QA datasets in terms of both accuracy and expected calibration error. We show that meta-learning can also be combined with in-context learning, resulting in further improvements in both these datasets and legal and chemistry applications.","short_abstract":"Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, when a problem requires incorporating information from multiple datasets - as in few shot learning - generalization across datasets can be limited, driving up tra...","url_abs":"https://arxiv.org/abs/2508.14285","url_pdf":"https://arxiv.org/pdf/2508.14285v3","authors":"[\"Liyi Zhang\",\"Jake Snell\",\"Thomas L. Griffiths\"]","published":"2025-08-19T21:57:59Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"stat.ML\"]","methods":"[\"Large Language Model\",\"Language Model\",\"LoRA\"]","has_code":false}
