{"ID":2848928,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.04691","arxiv_id":"2511.04691","title":"A Penny for Your Thoughts: Decoding Speech from Inexpensive Brain Signals","abstract":"We explore whether neural networks can decode brain activity into speech by mapping EEG recordings to audio representations. Using EEG data recorded as subjects listened to natural speech, we train a model with a contrastive CLIP loss to align EEG-derived embeddings with embeddings from a pre-trained transformer-based speech model. Building on the state-of-the-art EEG decoder from Meta, we introduce three architectural modifications: (i) subject-specific attention layers (+0.15% WER improvement), (ii) personalized spatial attention (+0.45%), and (iii) a dual-path RNN with attention (-1.87%). Two of the three modifications improved performance, highlighting the promise of personalized architectures for brain-to-speech decoding and applications in brain-computer interfaces.","short_abstract":"We explore whether neural networks can decode brain activity into speech by mapping EEG recordings to audio representations. Using EEG data recorded as subjects listened to natural speech, we train a model with a contrastive CLIP loss to align EEG-derived embeddings with embeddings from a pre-trained transformer-based...","url_abs":"https://arxiv.org/abs/2511.04691","url_pdf":"https://arxiv.org/pdf/2511.04691v1","authors":"[\"Quentin Auster\",\"Kateryna Shapovalenko\",\"Chuang Ma\",\"Demaio Sun\"]","published":"2025-10-28T06:02:41Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\",\"cs.CL\",\"cs.HC\",\"eess.AS\",\"q-bio.NC\"]","methods":"[\"Transformer\"]","has_code":false}