{"ID":571995,"CreatedAt":"2026-03-04T20:59:09Z","UpdatedAt":"2026-03-04T20:59:09Z","DeletedAt":null,"paper_url":"https://paperswithcode.com/paper/bge-m3-embedding-multi-lingual-multi","arxiv_id":"2402.03216","title":"BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation","abstract":"In this paper, we present a new embedding model, called M3-Embedding, which is distinguished by its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It can support more than 100 working languages, leading to new state-of-the-art performances on multi-lingual and cross-lingual retrieval tasks. It can simultaneously perform the three common retrieval functionalities of embedding models: dense retrieval, multi-vector retrieval, and sparse retrieval, which provides a unified model foundation for real-world IR applications. It is able to process inputs of different granularities, spanning from short sentences to long documents of up to 8192 tokens. The effective training of M3-Embedding involves the following technical contributions. We propose a novel self-knowledge distillation approach, where the relevance scores from different retrieval functionalities can be integrated as the teacher signal to enhance the training quality. We also optimize the batching strategy, enabling a large batch size and high training throughput to ensure the discriminativeness of embeddings. To the best of our knowledge, M3-Embedding is the first embedding model that achieves such strong versatility. The model and code will be publicly available at https://github.com/FlagOpen/FlagEmbedding.","short_abstract":"It can simultaneously perform the three common retrieval functionalities of embedding models: dense retrieval, multi-vector retrieval, and sparse retrieval, which provides a unified model foundation for real-world IR applications.","url_abs":"https://arxiv.org/abs/2402.03216v4","url_pdf":"https://arxiv.org/pdf/2402.03216v4.pdf","authors":"[\"Jianlv Chen\", \"Shitao Xiao\", \"Peitian Zhang\", \"Kun Luo\", \"Defu Lian\", \"Zheng Liu\"]","published":"2024-02-05T00:00:00Z","tasks":"[\"Knowledge Distillation\", \"Retrieval\", \"Self-Knowledge Distillation\"]","methods":"[]","has_code":false,"code_links":[{"ID":302953,"CreatedAt":"2026-03-04T21:00:12Z","UpdatedAt":"2026-03-04T21:00:12Z","DeletedAt":null,"paper_id":571995,"paper_url":"https://paperswithcode.com/paper/bge-m3-embedding-multi-lingual-multi","paper_title":"BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation","repo_url":"https://github.com/flagopen/flagembedding","is_official":true,"mentioned_in_paper":true,"mentioned_in_github":true,"framework":"pytorch","github_stars":0},{"ID":364557,"CreatedAt":"2026-03-04T21:00:12Z","UpdatedAt":"2026-03-04T21:00:12Z","DeletedAt":null,"paper_id":571995,"paper_url":"https://paperswithcode.com/paper/bge-m3-embedding-multi-lingual-multi","paper_title":"BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation","repo_url":"https://github.com/2024-MindSpore-1/Code2/tree/main/model-1/bge_m3","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":false,"framework":"mindspore","github_stars":0},{"ID":371693,"CreatedAt":"2026-03-04T21:00:12Z","UpdatedAt":"2026-03-04T21:00:12Z","DeletedAt":null,"paper_id":571995,"paper_url":"https://paperswithcode.com/paper/bge-m3-embedding-multi-lingual-multi","paper_title":"BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation","repo_url":"https://github.com/allen-li1231/treehop-rag","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"framework":"pytorch","github_stars":0},{"ID":493728,"CreatedAt":"2026-03-04T21:00:12Z","UpdatedAt":"2026-03-04T21:00:12Z","DeletedAt":null,"paper_id":571995,"paper_url":"https://paperswithcode.com/paper/bge-m3-embedding-multi-lingual-multi","paper_title":"BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation","repo_url":"https://github.com/terrierteam/pyterrier_dr","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"framework":"pytorch","github_stars":0}]}
