{"ID":2892894,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.14725","arxiv_id":"2507.14725","title":"GRID: Scalable Task-Agnostic Prompt-Based Continual Learning for Language Models","abstract":"Prompt-based continual learning (CL) provides a parameter-efficient approach for adapting large language models (LLMs) across task sequences. However, most existing methods rely on task-aware inference and maintain a growing set of task-specific prompts, which introduces two major challenges: (1) severe performance degradation on earlier tasks under task-agnostic inference, and (2) limited scalability due to prompt memory accumulation as task sequences grow. In this paper, we present GRID, a unified framework designed to address these challenges. GRID incorporates a decoding mechanism that enhances backward transfer by leveraging representative inputs, automatic task identification, and constrained decoding. Furthermore, it employs a gradient-guided prompt selection strategy to compress less informative prompts into a single aggregated representation, ensuring scalable and memory-efficient continual learning. Extensive experiments on long-sequence and negative transfer benchmarks show that GRID improves average accuracy and backward transfer, achieves competitive forward transfer, and substantially reduces prompt memory usage.","short_abstract":"Prompt-based continual learning (CL) provides a parameter-efficient approach for adapting large language models (LLMs) across task sequences. However, most existing methods rely on task-aware inference and maintain a growing set of task-specific prompts, which introduces two major challenges: (1) severe performance deg...","url_abs":"https://arxiv.org/abs/2507.14725","url_pdf":"https://arxiv.org/pdf/2507.14725v3","authors":"[\"Anushka Tiwari\",\"Sayantan Pal\",\"Rohini K. Srihari\",\"Kaiyi Ji\"]","published":"2025-07-19T19:15:03Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
