{"ID":2838528,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.17100","arxiv_id":"2511.17100","title":"Geometric-disentangelment Unlearning","abstract":"Large language models (LLMs) can internalize private or harmful content, motivating unlearning that removes a forget set while preserving retaining knowledge. However, forgetting updates often cause collateral degradation on retaining knowledge, creating a persistent trade-off. Existing LLM unlearning methods are often heuristic, and other theoretical approaches rely on offline feature constructions that do not capture update-time forget-retain interaction in LLMs. To address this limitation, we aim to develop an LLM unlearning method that reduces the forget-retain trade-off with theoretical guarantees. We take a first-principles view by formalizing \"no side effects\" as local retain invariance under small parameter updates, and prove an equivalence under optimizer-induced geometry: the retain loss is locally invariant if and only if the update direction is orthogonal to the subspace spanned by retain gradients. Based on the insight, we propose Geometric-disentanglement Unlearning (GU), a lightweight and theoretically grounded projection that can be plug-and-play to existing gradient-based unlearning methods to mitigate forget-retain side effects. Experiments on TOFU, MUSE, and WMDP-cyber show that GU strengthens forgetting while reducing retain drift. When added to SimNPO, it achieves up to 62\\% improved forgetting Extraction Strength (ES) and 31\\% higher retain ES. We open-sourced our code in https://github.com/Lemutisme/Geometric-Unlearning.","short_abstract":"Large language models (LLMs) can internalize private or harmful content, motivating unlearning that removes a forget set while preserving retaining knowledge. However, forgetting updates often cause collateral degradation on retaining knowledge, creating a persistent trade-off. Existing LLM unlearning methods are often...","url_abs":"https://arxiv.org/abs/2511.17100","url_pdf":"https://arxiv.org/pdf/2511.17100v4","authors":"[\"Duo Zhou\",\"Yuji Zhang\",\"Tianxin Wei\",\"Ruizhong Qiu\",\"Ke Yang\",\"Xiao Lin\",\"Cheng Qian\",\"Jingrui He\",\"Hanghang Tong\",\"Chengxiang Zhai\",\"Heng Ji\",\"Huan Zhang\"]","published":"2025-11-21T09:58:25Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":606784,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2838528,"paper_url":"https://arxiv.org/abs/2511.17100","paper_title":"Geometric-disentangelment Unlearning","repo_url":"https://github.com/Lemutisme/Geometric-Unlearning","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
