{"ID":2879489,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.16463","arxiv_id":"2508.16463","title":"Modular Embedding Recomposition for Incremental Learning","abstract":"The advent of pre-trained Vision-Language Models (VLMs) has significantly transformed Continual Learning (CL), mainly due to their zero-shot classification abilities. Such proficiency makes VLMs well-suited for real-world applications, enabling robust performance on novel unseen classes without requiring adaptation. However, fine-tuning remains essential when downstream tasks deviate significantly from the pre-training domain. Prior CL approaches primarily focus on preserving the zero-shot capabilities of VLMs during incremental fine-tuning on a downstream task. We take a step further by devising an approach that transforms preservation into enhancement of the zero-shot capabilities of VLMs. Our approach, named MoDular Embedding Recomposition (MoDER), introduces a modular framework that trains multiple textual experts, each specialized in a single seen class, and stores them in a foundational hub. At inference time, for each unseen class, we query the hub and compose the retrieved experts to synthesize a refined prototype that improves classification. We show the effectiveness of our method across two popular zero-shot incremental protocols, Class-IL and MTIL, comprising a total of 14 datasets. The codebase is available at https://github.com/aimagelab/mammoth.","short_abstract":"The advent of pre-trained Vision-Language Models (VLMs) has significantly transformed Continual Learning (CL), mainly due to their zero-shot classification abilities. Such proficiency makes VLMs well-suited for real-world applications, enabling robust performance on novel unseen classes without requiring adaptation. Ho...","url_abs":"https://arxiv.org/abs/2508.16463","url_pdf":"https://arxiv.org/pdf/2508.16463v2","authors":"[\"Aniello Panariello\",\"Emanuele Frascaroli\",\"Pietro Buzzega\",\"Lorenzo Bonicelli\",\"Angelo Porrello\",\"Simone Calderara\"]","published":"2025-08-22T15:25:40Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":610589,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2879489,"paper_url":"https://arxiv.org/abs/2508.16463","paper_title":"Modular Embedding Recomposition for Incremental Learning","repo_url":"https://github.com/aimagelab/mammoth","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
