{"ID":2827942,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.15229","arxiv_id":"2512.15229","title":"O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker Diarization","abstract":"We introduce O-EENC-SD: an end-to-end online speaker diarization system based on EEND-EDA, featuring a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system provides key advantages over existing methods: a hyperparameter-free solution compared to unsupervised clustering approaches, and a more efficient alternative to current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD provides a great trade-off between DER and complexity, even when working on independent chunks with no overlap, making the system extremely efficient.","short_abstract":"We introduce O-EENC-SD: an end-to-end online speaker diarization system based on EEND-EDA, featuring a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system provides key advantag...","url_abs":"https://arxiv.org/abs/2512.15229","url_pdf":"https://arxiv.org/pdf/2512.15229v1","authors":"[\"Elio Gruttadauria\",\"Mathieu Fontaine\",\"Jonathan Le Roux\",\"Slim Essid\"]","published":"2025-12-17T09:27:23Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.SD\",\"eess.SP\"]","methods":"[]","has_code":false}
