{"ID":2878435,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.17742","arxiv_id":"2508.17742","title":"EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models","abstract":"Electroencephalography foundation models (EEG-FMs) have advanced brain signal analysis, but the lack of standardized evaluation benchmarks impedes model comparison and scientific progress. Current evaluations rely on inconsistent protocols that render cross-model comparisons unreliable, while a lack of diagnostic analyses obscures the internal mechanisms driving transfer efficiency and scaling behaviors. To address this, we introduce \\textbf{EEG-FM-Bench}, a unified system for the standardized evaluation of EEG-FMs. The benchmark integrates 14 datasets across 10 paradigms and incorporates diverse experimental settings, including multiple fine-tuning strategies, task organizations, and classifier configurations, supported by tools for gradient and representation analysis. Our experiments and analysis reveal several critical insights: (1) multi-task learning acts as a critical regularizer to mitigate overfitting in data-scarce EEG contexts; (2) pre-training efficiency is currently limited by gradient conflicts between reconstruction objectives and downstream tasks; (3) model scaling deviates from typical laws, as compact architectures with domain-specific inductive biases consistently outperform significantly larger models. This benchmark enables fair comparison and reproducible analysis, shifting the field from fragmented results to interpretable advances. Code is available at https://github.com/xw1216/EEG-FM-Bench.","short_abstract":"Electroencephalography foundation models (EEG-FMs) have advanced brain signal analysis, but the lack of standardized evaluation benchmarks impedes model comparison and scientific progress. Current evaluations rely on inconsistent protocols that render cross-model comparisons unreliable, while a lack of diagnostic analy...","url_abs":"https://arxiv.org/abs/2508.17742","url_pdf":"https://arxiv.org/pdf/2508.17742v2","authors":"[\"Wei Xiong\",\"Jiangtong Li\",\"Jie Li\",\"Kun Zhu\",\"Changjun Jiang\"]","published":"2025-08-25T07:34:33Z","proceeding":"eess.SP","tasks":"[\"eess.SP\",\"cs.AI\",\"cs.HC\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false,"code_links":[{"ID":610481,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2878435,"paper_url":"https://arxiv.org/abs/2508.17742","paper_title":"EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models","repo_url":"https://github.com/xw1216/EEG-FM-Bench","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}