{"ID":2869607,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.15476","arxiv_id":"2509.15476","title":"Evaluating Multimodal Large Language Models on Spoken Sarcasm Understanding","abstract":"Sarcasm detection remains a challenge in natural language understanding, as sarcastic intent often relies on subtle cross-modal cues spanning text, speech, and vision. While prior work has primarily focused on textual or visual-textual sarcasm, comprehensive audio-visual-textual sarcasm understanding remains underexplored. In this paper, we systematically evaluate large language models (LLMs) and multimodal LLMs for sarcasm detection on English (MUStARD++) and Chinese (MCSD 1.0) in zero-shot, few-shot, and LoRA fine-tuning settings. In addition to direct classification, we explore models as feature encoders, integrating their representations through a collaborative gating fusion module. Experimental results show that audio-based models achieve the strongest unimodal performance, while text-audio and audio-vision combinations outperform unimodal and trimodal models. Furthermore, MLLMs such as Qwen-Omni show competitive zero-shot and fine-tuned performance. Our findings highlight the potential of MLLMs for cross-lingual, audio-visual-textual sarcasm understanding.","short_abstract":"Sarcasm detection remains a challenge in natural language understanding, as sarcastic intent often relies on subtle cross-modal cues spanning text, speech, and vision. While prior work has primarily focused on textual or visual-textual sarcasm, comprehensive audio-visual-textual sarcasm understanding remains underexplo...","url_abs":"https://arxiv.org/abs/2509.15476","url_pdf":"https://arxiv.org/pdf/2509.15476v1","authors":"[\"Zhu Li\",\"Xiyuan Gao\",\"Yuqing Zhang\",\"Shekhar Nayak\",\"Matt Coler\"]","published":"2025-09-18T22:44:27Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.MM\"]","methods":"[\"Large Language Model\",\"Language Model\",\"LoRA\"]","has_code":false}
