{"ID":2842988,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.09690","arxiv_id":"2511.09690","title":"Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages","abstract":"Automatic speech recognition (ASR) has advanced in high-resource languages, but most of the world's 7,000+ languages remain unsupported, leaving thousands of long-tail languages behind. Expanding ASR coverage has been costly and limited by architectures that restrict language support, making extension inaccessible to most--all while entangled with ethical concerns when pursued without community collaboration. To transcend these limitations, we introduce Omnilingual ASR, the first large-scale ASR system designed for extensibility. Omnilingual ASR enables communities to introduce unserved languages with only a handful of data samples. It scales self-supervised pre-training to 7B parameters to learn robust speech representations and introduces an encoder-decoder architecture designed for zero-shot generalization, leveraging a LLM-inspired decoder. This capability is grounded in a massive and diverse training corpus; by combining breadth of coverage with linguistic variety, the model learns representations robust enough to adapt to unseen languages. Incorporating public resources with community-sourced recordings gathered through compensated local partnerships, Omnilingual ASR expands coverage to over 1,600 languages, the largest such effort to date--including over 500 never before served by ASR. Automatic evaluations show substantial gains over prior systems, especially in low-resource conditions, and strong generalization. We release Omnilingual ASR as a family of models, from 300M variants for low-power devices to 7B for maximum accuracy. We reflect on the ethical considerations shaping this design and conclude by discussing its societal impact. In particular, we highlight how open-sourcing models and tools can lower barriers for researchers and communities, inviting new forms of participation. Open-source artifacts are available at https://github.com/facebookresearch/omnilingual-asr.","short_abstract":"Automatic speech recognition (ASR) has advanced in high-resource languages, but most of the world's 7,000+ languages remain unsupported, leaving thousands of long-tail languages behind. Expanding ASR coverage has been costly and limited by architectures that restrict language support, making extension inaccessible to m...","url_abs":"https://arxiv.org/abs/2511.09690","url_pdf":"https://arxiv.org/pdf/2511.09690v1","authors":"[\"Omnilingual ASR team\",\"Gil Keren\",\"Artyom Kozhevnikov\",\"Yen Meng\",\"Christophe Ropers\",\"Matthew Setzler\",\"Skyler Wang\",\"Ife Adebara\",\"Michael Auli\",\"Can Balioglu\",\"Kevin Chan\",\"Chierh Cheng\",\"Joe Chuang\",\"Caley Droof\",\"Mark Duppenthaler\",\"Paul-Ambroise Duquenne\",\"Alexander Erben\",\"Cynthia Gao\",\"Gabriel Mejia Gonzalez\",\"Kehan Lyu\",\"Sagar Miglani\",\"Vineel Pratap\",\"Kaushik Ram Sadagopan\",\"Safiyyah Saleem\",\"Arina Turkatenko\",\"Albert Ventayol-Boada\",\"Zheng-Xin Yong\",\"Yu-An Chung\",\"Jean Maillard\",\"Rashel Moritz\",\"Alexandre Mourachko\",\"Mary Williamson\",\"Shireen Yates\"]","published":"2025-11-12T19:48:09Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":607172,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2842988,"paper_url":"https://arxiv.org/abs/2511.09690","paper_title":"Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages","repo_url":"https://github.com/facebookresearch/omnilingual-asr","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
