{"ID":2864846,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.23472","arxiv_id":"2509.23472","title":"Memory-Efficient Fine-Tuning via Low-Rank Activation Compression","abstract":"The parameter-efficient fine-tuning paradigm has garnered significant attention with the advancement of foundation models. Although numerous methods have been proposed to reduce the number of trainable parameters, their substantial memory overhead remains a critical bottleneck that hinders practical deployment. In this paper, we observe that model activations constitute a major source of memory consumption, especially under large batch sizes and long context lengths; however, the rank of the activations remains consistently low. Motivated by this insight, we propose a memory-efficient fine-tuning approach Low-Rank Activation Compression (LoRAct). Unlike prior work, LoRAct provides a more flexible and versatile compressing strategy that can be applied online during the forward pass without the need for any calibration data. Moreover, LoRAct incorporates a novel sampling-based orthogonal decomposition algorithm specifically designed for low-rank matrices, offering improved computational efficiency and a tighter error bound compared to the widely used RSVD. Experiments on both vision and language tasks demonstrate the effectiveness of LoRAct. Notably, LoRAct further reduces activation memory by approximately 80% in comparison with the widely adopted LoRA method, while maintaining competitive performance. The source code is available at https://github.com/shijxcs/meft.","short_abstract":"The parameter-efficient fine-tuning paradigm has garnered significant attention with the advancement of foundation models. Although numerous methods have been proposed to reduce the number of trainable parameters, their substantial memory overhead remains a critical bottleneck that hinders practical deployment. \nIn this...","url_abs":"https://arxiv.org/abs/2509.23472","url_pdf":"https://arxiv.org/pdf/2509.23472v1","authors":"[\"Jiang-Xin Shi\",\"Wen-Da Wei\",\"Jin-Fei Qi\",\"Xuanyu Chen\",\"Tong Wei\",\"Yu-Feng Li\"]","published":"2025-09-27T19:48:32Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"LoRA\"]","has_code":false,"code_links":[{"ID":609206,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2864846,"paper_url":"https://arxiv.org/abs/2509.23472","paper_title":"Memory-Efficient Fine-Tuning via Low-Rank Activation Compression","repo_url":"https://github.com/shijxcs/meft","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
