{"ID":2852293,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.18680","arxiv_id":"2510.18680","title":"Learning Task-Agnostic Representations through Multi-Teacher Distillation","abstract":"Casting complex inputs into tractable representations is a critical step across various fields. Diverse embedding models emerge from differences in architectures, loss functions, input modalities and datasets, each capturing unique aspects of the input. Multi-teacher distillation leverages this diversity to enrich representations but often remains tailored to specific tasks. In this paper, we introduce a task-agnostic framework based on a \"majority vote\" objective function. We demonstrate that this function is bounded by the mutual information between student and teachers' embeddings, leading to a task-agnostic distillation loss that eliminates dependence on task-specific labels or prior knowledge. Our evaluations across text, vision models, and molecular modeling show that our method effectively leverages teacher diversity, resulting in representations enabling better performance for a wide range of downstream tasks such as classification, clustering, or regression. Additionally, we train and release state-of-the-art embedding models, enhancing downstream performance in various modalities.","short_abstract":"Casting complex inputs into tractable representations is a critical step across various fields. Diverse embedding models emerge from differences in architectures, loss functions, input modalities and datasets, each capturing unique aspects of the input. Multi-teacher distillation leverages this diversity to enrich repr...","url_abs":"https://arxiv.org/abs/2510.18680","url_pdf":"https://arxiv.org/pdf/2510.18680v1","authors":"[\"Philippe Formont\",\"Maxime Darrin\",\"Banafsheh Karimian\",\"Jackie CK Cheung\",\"Eric Granger\",\"Ismail Ben Ayed\",\"Mohammadhadi Shateri\",\"Pablo Piantanida\"]","published":"2025-10-21T14:36:33Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}