{"ID":2852846,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.17662","arxiv_id":"2510.17662","title":"DELULU: Discriminative Embedding Learning Using Latent Units for Speaker-Aware Self-Trained Speech Foundational Model","abstract":"Self-supervised speech models have achieved remarkable success on content-driven tasks, yet they remain limited in capturing speaker-discriminative features critical for verification, diarization, and profiling applications. We introduce DELULU, a speaker-aware self-trained foundational model that addresses this limitation by incorporating speaker-informed structure into pseudo-label generation. DELULU leverages frame-level embeddings from ReDimNet, a state-of-the-art speaker verification model, to guide k-means clustering during pre-training, introducing a speaker-discriminative inductive bias that aligns representation learning with speaker identity. DELULU significantly outperforms prior SSL models across a range of speaker-centric tasks, achieving up to 62% relative improvement in equal error rate (EER) for speaker verification and consistent gains on zero-shot profiling tasks including gender, age, accent, and speaker counting; notably surpassing even its teacher model on zero-shot evaluations. Our findings demonstrate that DELULU is a strong universal encoder for speaker-aware speech processing, enabling superior performance without task-specific fine-tuning.","short_abstract":"Self-supervised speech models have achieved remarkable success on content-driven tasks, yet they remain limited in capturing speaker-discriminative features critical for verification, diarization, and profiling applications. We introduce DELULU, a speaker-aware self-trained foundational model that addresses th...","url_abs":"https://arxiv.org/abs/2510.17662","url_pdf":"https://arxiv.org/pdf/2510.17662v2","authors":"[\"Massa Baali\",\"Rita Singh\",\"Bhiksha Raj\"]","published":"2025-10-20T15:35:55Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.CL\"]","methods":"[]","has_code":false}
