{"ID":2870025,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.14373","arxiv_id":"2509.14373","title":"CodeLSI: Leveraging Foundation Models for Automated Code Generation with Low-Rank Optimization and Domain-Specific Instruction Tuning","abstract":"Context: Automated code generation using Foundation Models (FMs) offers promising solutions for enhancing software development efficiency. However, challenges remain in ensuring domain specificity, cost-effectiveness, and security - especially when relying on third-party APIs. This paper introduces CodeLSI, a framework that combines low-rank optimization and domain-specific instruction tuning to address these challenges. Objectives: The aim of this study is to develop and evaluate CodeLSI, a novel approach for generating high-quality code tailored to specific domains, using FMs fine-tuned on company infrastructure without dependence on external APIs. Methods: CodeLSI applies low-rank adaptation techniques to reduce the computational cost of model pre-training and fine-tuning. Domain-specific instruction tuning is employed to align code generation with organizational needs. We implemented and tested the framework on real-world JavaScript coding tasks using datasets drawn from internal software projects. Results: Experimental evaluations show that CodeLSI produces high-quality, context aware code. It outperforms baseline models in terms of relevance, accuracy, and domain fit. The use of low-rank optimization significantly reduced resource requirements, enabling scalable training on company-owned infrastructure. Conclusion: CodeLSI demonstrates that combining low-rank optimization with domain specific tuning can enhance the practicality and performance of FMs for automated code generation. This approach provides a secure, cost-efficient alternative to commercial API based solutions and supports faster, more targeted innovation in software development.","short_abstract":"Context: Automated code generation using Foundation Models (FMs) offers promising solutions for enhancing software development efficiency. However, challenges remain in ensuring domain specificity, cost-effectiveness, and security - especially when relying on third-party APIs. This paper introduces CodeLSI, a framework...","url_abs":"https://arxiv.org/abs/2509.14373","url_pdf":"https://arxiv.org/pdf/2509.14373v1","authors":"[\"Huy Le\",\"Phong Nguyen\",\"Hao Do\",\"Tuan Nguyen\",\"Thien Pham\",\"Anh Nguyen-Duc\",\"Tho Quan\"]","published":"2025-09-17T19:20:23Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
