{"ID":2834780,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00694","arxiv_id":"2512.00694","title":"Affordance-First Decomposition for Continual Learning in Video-Language Understanding","abstract":"Continual learning for video-language understanding is increasingly important as models face non-stationary data, domains, and query styles, yet prevailing solutions blur what should stay stable versus what should adapt, rely on static routing/capacity, or require replaying past videos. We aim to explicitly specify where stability lives and where plasticity should be focused under realistic memory and privacy constraints. We introduce Affordance-First Decomposition (AFD): videos are mapped to slowly varying affordance tokens that form a shared, time-aligned substrate, while a lightweight, query-routed, conflict-aware scheduler concentrates adaptation and grows capacity only when needed. The substrate is stabilized via weak alignment and teacher consistency, and training uses question-only replay. AFD achieves state-of-the-art across protocols: 51.6% average accuracy with -1.8% forgetting on domain-incremental VideoQA, ViLCo R@1@0.5 of 29.6% (MQ) and 20.7% (NLQ) with 18.4% stAP@0.25 (VQ), and 39.5% accuracy with -1.6% forgetting on time-incremental iVQA. Overall, AFD offers an explicit, interpretable split between a stable interaction-centered substrate and targeted adaptation.","short_abstract":"Continual learning for video-language understanding is increasingly important as models face non-stationary data, domains, and query styles, yet prevailing solutions blur what should stay stable versus what should adapt, rely on static routing/capacity, or require replaying past videos. We aim to explicitly specify wh...","url_abs":"https://arxiv.org/abs/2512.00694","url_pdf":"https://arxiv.org/pdf/2512.00694v1","authors":"[\"Mengzhu Xu\",\"Hanzhi Liu\",\"Ningkang Peng\",\"Qianyu Chen\",\"Canran Xiao\"]","published":"2025-11-30T02:04:39Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
