Hypernetwork-Driven Low-Rank Adaptation Across Attention Heads

cs.LG arXiv:2510.04295

Abstract

Parameter-efficient fine-tuning (PEFT) has emerged as a powerful paradigm for adapting large-scale pre-trained models to downstream tasks with minimal additional parameters. Among PEFT methods, Low-Rank Adaptation (LoRA) stands out for its effectiveness: it represents weight updates as products of trainable low-rank matrices, which enables efficient adaptation. However, when applied to multi-head self-attention, existing LoRA-based methods typically fine-tune each attention head independently, overlooking potential interactions and shared structure among heads. To address this limitation, we propose Hypernetwork-Driven Low-Rank Adaptation (HyRA), which employs a hypernetwork to jointly generate the low-rank matrices for all attention heads within a layer. The shared generator promotes cross-head information sharing, helping the low-rank modules avoid the redundant feature learning seen in traditional LoRA methods. Theoretically, our method achieves significantly better sample efficiency than standard LoRA. Empirically, we evaluate HyRA on a comprehensive suite of language and vision benchmarks, where it consistently outperforms existing PEFT baselines across a wide range of tasks. Notably, in low-data regimes, HyRA achieves substantial improvements over LoRA, underscoring its practical sample efficiency and effectiveness in data-scarce scenarios.
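
To make the idea of a shared generator concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible design that the abstract does not specify: each head owns a small embedding vector, and a single pair of linear maps shared by the layer turns that embedding into the head's low-rank factors. The class name SharedHeadHypernet, the embedding size d_embed, the linear generator, and the random initialization are all hypothetical choices made for illustration.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def reshape(flat, rows, cols):
    """Turn a flat list of length rows * cols into a list of rows."""
    return [flat[i * cols:(i + 1) * cols] for i in range(rows)]


def rand_matrix(rng, rows, cols, scale=0.1):
    """Small Gaussian matrix; the initialization scheme is an assumption."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class SharedHeadHypernet:
    """Hypothetical generator of low-rank updates for every head in a layer."""

    def __init__(self, num_heads, d_model, d_head, rank, d_embed, seed=0):
        rng = random.Random(seed)
        self.d_model, self.d_head, self.rank = d_model, d_head, rank
        # Per-head parameters: one embedding vector for each attention head.
        self.head_embed = rand_matrix(rng, num_heads, d_embed)
        # Shared parameters: the same two linear maps serve all heads.
        self.gen_A = rand_matrix(rng, rank * d_model, d_embed)
        self.gen_B = rand_matrix(rng, d_head * rank, d_embed)

    def head_update(self, h):
        """Return the (d_head x d_model) low-rank update B_h A_h for head h."""
        e = self.head_embed[h]
        A = reshape(matvec(self.gen_A, e), self.rank, self.d_model)
        B = reshape(matvec(self.gen_B, e), self.d_head, self.rank)
        return matmul(B, A)


# Usage: one generator produces a distinct update for each of four heads.
net = SharedHeadHypernet(num_heads=4, d_model=8, d_head=3, rank=1, d_embed=3)
updates = [net.head_update(h) for h in range(4)]
```

In this sketch the heads differ only through their embeddings, so anything the generator learns is available to every head in the layer. That coupling is the intuition behind the cross-head sharing described in the abstract; standard LoRA would instead keep a separate, unrelated pair of factors for each head.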
