{"ID":2859435,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.06072","arxiv_id":"2510.06072","title":"EmoHRNet: High-Resolution Neural Network Based Speech Emotion Recognition","abstract":"Speech emotion recognition (SER) is pivotal for enhancing human-machine interactions. This paper introduces \"EmoHRNet\", a novel adaptation of High-Resolution Networks (HRNet) tailored for SER. The HRNet structure is designed to maintain high-resolution representations from the initial to the final layers. By transforming audio samples into spectrograms, EmoHRNet leverages the HRNet architecture to extract high-level features. EmoHRNet's unique architecture maintains high-resolution representations throughout, capturing both granular and overarching emotional cues from speech signals. The model outperforms leading models, achieving accuracies of 92.45% on RAVDESS, 80.06% on IEMOCAP, and 92.77% on EMOVO. Thus, we show that EmoHRNet sets a new benchmark in the SER domain.","short_abstract":"Speech emotion recognition (SER) is pivotal for enhancing human-machine interactions. This paper introduces \"EmoHRNet\", a novel adaptation of High-Resolution Networks (HRNet) tailored for SER. The HRNet structure is designed to maintain high-resolution representations from the initial to the final layers. By transformi...","url_abs":"https://arxiv.org/abs/2510.06072","url_pdf":"https://arxiv.org/pdf/2510.06072v1","authors":"[\"Akshay Muppidi\",\"Martin Radfar\"]","published":"2025-10-07T15:59:40Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.LG\"]","methods":"[]","has_code":false}
