{"ID":3005097,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-05T07:50:16.0004273Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03359","arxiv_id":"2606.03359","title":"Speech Emotion Recognition using Attention-based LSTM-Network with Residual Connection","abstract":"Speech emotion recognition is an important component of modern human-computer interaction systems. However, many state-of-the-art approaches rely on large pretrained models with high computational and memory requirements, limiting their applicability. This paper proposes ResLSTM-SA, a lightweight architecture that integrates residual connections with soft attention within an LSTM-based framework. Evaluated on the RAVDESS dataset under strict speaker-independent partitioning, the proposed model outperforms conventional attention-based LSTM baselines and several previously reported CNN- and hybrid CNN-LSTM architectures in terms of unweighted average recall (UAR). The best-performing variant (ResLSTM-SA-h64) achieves a maximum UAR of 0.6517 with only 46.8k trainable parameters, delivering competitive accuracy with three orders of magnitude fewer parameters than large-scale self-supervised alternatives, thereby enabling efficient deployment on edge devices and real-time voice assistants. The source code is available at https://github.com/Mak-Sim/ResLSTM-SER.","short_abstract":"Speech emotion recognition is an important component of modern human-computer interaction systems. However, many state-of-the-art approaches rely on large pretrained models with high computational and memory requirements, limiting their applicability. This paper proposes ResLSTM-SA, a lightweight architecture that inte...","url_abs":"https://arxiv.org/abs/2606.03359","url_pdf":"https://arxiv.org/pdf/2606.03359v1","authors":"[\"Daniil Krasnoproshin\",\"Maxim Vashkevich\"]","published":"2026-06-02T09:08:59Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.CL\",\"cs.LG\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false,"code_links":[{"ID":612734,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-03T03:09:48.883664427Z","DeletedAt":null,"paper_id":3005097,"paper_url":"https://arxiv.org/abs/2606.03359","paper_title":"Speech Emotion Recognition using Attention-based LSTM-Network with Residual Connection","repo_url":"https://github.com/Mak-Sim/ResLSTM-SER","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
