{"ID":2876345,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.00996","arxiv_id":"2509.00996","title":"MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper","abstract":"Considering deep neural networks as manifold mappers, the pretrain-then-fine-tune paradigm can be interpreted as a two-stage process: pretrain establishes a broad knowledge base, and fine-tune adjusts the model parameters to activate specific neural pathways to align with the target manifold. Although prior fine-tuning approaches demonstrate success, their rigid parameter space limits their ability to dynamically activate appropriate neural pathways, rendering them ill-equipped to adapt flexibly to the diverse and evolving data distributions. In light of this view, we propose a novel approach, Mixture of Expert Prompt Tuning (MEPT), as an effective and efficient manifold-mapping framework. MEPT leverages the Mixture of Experts architecture by integrating multiple prompt experts to adaptively learn diverse and non-stationary data distributions. Empirical evaluations demonstrate that MEPT outperforms several state-of-the-art parameter efficient baselines on SuperGLUE, achieving notable improvements in mean accuracy (e.g., 1.94%) while significantly reducing activated prompts by 79.25%. The effectiveness of MEPT is further supported by theoretical insights from manifold learning and validated through neural activation pathway visualization results. Our code is available at https://runjia.tech/emnlp_mept/.","short_abstract":"Considering deep neural networks as manifold mappers, the pretrain-then-fine-tune paradigm can be interpreted as a two-stage process: pretrain establishes a broad knowledge base, and fine-tune adjusts the model parameters to activate specific neural pathways to align with the target manifold. 
Although prior fine-tuning...","url_abs":"https://arxiv.org/abs/2509.00996","url_pdf":"https://arxiv.org/pdf/2509.00996v2","authors":"[\"Runjia Zeng\",\"Guangyan Sun\",\"Qifan Wang\",\"Tong Geng\",\"Sohail Dianat\",\"Xiaotian Han\",\"Raghuveer Rao\",\"Xueling Zhang\",\"Cheng Han\",\"Lifu Huang\",\"Dongfang Liu\"]","published":"2025-08-31T21:19:25Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.CL\"]","methods":"[\"Mixture of Experts\"]","project_urls":"[\"https://runjia.tech/emnlp_mept/\"]","has_code":false,"code_links":[{"ID":610285,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2876345,"paper_url":"https://arxiv.org/abs/2509.00996","paper_title":"MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper","repo_url":"https://github.com/runtsang/MEPT","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0},{"ID":610286,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2876345,"paper_url":"https://arxiv.org/abs/2509.00996","paper_title":"MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper","repo_url":"https://github.com/nerfies/nerfies.github.io","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
