{"ID":2868838,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.15974","arxiv_id":"2509.15974","title":"BEFT: Bias-Efficient Fine-Tuning of Language Models in Low-Data Regimes","abstract":"Fine-tuning the bias terms of large language models (LLMs) has the potential to achieve unprecedented parameter efficiency while maintaining competitive performance, particularly in low-data regimes. However, the link between fine-tuning different bias terms (i.e., $\\boldsymbol{b}_q$, $\\boldsymbol{b}_k$, and $\\boldsymbol{b}_v$ in the query, key, or value projections) and downstream performance remains largely unclear to date. In this paper, we investigate the link between fine-tuning $\\boldsymbol{b}_q$, $\\boldsymbol{b}_k$, and $\\boldsymbol{b}_v$ with the performance of the downstream task. Our key finding is that directly fine-tuning $\\boldsymbol{b}_v$ generally leads to higher downstream performance in low-data regimes, in comparison to $\\boldsymbol{b}_q$ and $\\boldsymbol{b}_k$. We extensively evaluate this unique property across a wide range of LLMs spanning encoder-only and decoder-only architectures up to 6.7B parameters (including bias-free LLMs). Our results provide strong evidence for the effectiveness of directly fine-tuning $\\boldsymbol{b}_v$ across various downstream tasks. The implementation code is available at https://github.com/whubaichuan/BEFT.","short_abstract":"Fine-tuning the bias terms of large language models (LLMs) has the potential to achieve unprecedented parameter efficiency while maintaining competitive performance, particularly in low-data regimes. However, the link between fine-tuning different bias terms (i.e., $\\boldsymbol{b}_q$, $\\boldsymbol{b}_k$, and $\\boldsymb...","url_abs":"https://arxiv.org/abs/2509.15974","url_pdf":"https://arxiv.org/pdf/2509.15974v2","authors":"[\"Baichuan Huang\",\"Ananth Balashankar\",\"Amir Aminifar\"]","published":"2025-09-19T13:35:07Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":609628,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2868838,"paper_url":"https://arxiv.org/abs/2509.15974","paper_title":"BEFT: Bias-Efficient Fine-Tuning of Language Models in Low-Data Regimes","repo_url":"https://github.com/whubaichuan/BEFT","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}