{"ID":2824753,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.22455","arxiv_id":"2512.22455","title":"AFA-LoRA: Enabling Non-Linear Adaptations in LoRA with Activation Function Annealing","abstract":"Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method. However, its linear adaptation process limits its expressive power. This means there is a gap between the expressive power of linear training and non-linear training. To bridge this gap, we propose AFA-LoRA, a novel training strategy that brings non-linear expressivity to LoRA while maintaining its seamless mergeability. Our key innovation is an annealed activation function that transitions from a non-linear to a linear transformation during training, allowing the adapter to initially adopt stronger representational capabilities before converging to a mergeable linear form. We implement our method on supervised fine-tuning, reinforcement learning, and speculative decoding. The results show that AFA-LoRA reduces the performance gap between LoRA and full-parameter training. This work enables a more powerful and practical paradigm of parameter-efficient adaptation.","short_abstract":"Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method. However, its linear adaptation process limits its expressive power. This means there is a gap between the expressive power of linear training and non-linear training. To bridge this gap, we propose AFA-LoRA, a novel training s...","url_abs":"https://arxiv.org/abs/2512.22455","url_pdf":"https://arxiv.org/pdf/2512.22455v2","authors":"[\"Jiacheng Li\",\"Jianchao Tan\",\"Zhidong Yang\",\"Feiye Huo\",\"Yerui Sun\",\"Yuchen Xie\",\"Xunliang Cai\"]","published":"2025-12-27T04:12:40Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.CL\"]","methods":"[\"Reinforcement Learning\",\"LoRA\"]","has_code":false}
