{"ID":2868462,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.16722","arxiv_id":"2509.16722","title":"A Multi-Level Benchmark for Causal Language Understanding in Social Media Discourse","abstract":"Understanding causal language in informal discourse is a core yet underexplored challenge in NLP. Existing datasets largely focus on explicit causality in structured text, providing limited support for detecting implicit causal expressions, particularly those found in informal, user-generated social media posts. We introduce CausalTalk, a multi-level dataset of five years of Reddit posts (2020-2024) discussing public health related to the COVID-19 pandemic, among which 10120 posts are annotated across four causal tasks: (1) binary causal classification, (2) explicit vs. implicit causality, (3) cause-effect span extraction, and (4) causal gist generation. Annotations comprise both gold-standard labels created by domain experts and silver-standard labels generated by GPT-4o and verified by human annotators. CausalTalk bridges fine-grained causal detection and gist-based reasoning over informal text. It enables benchmarking across both discriminative and generative models, and provides a rich resource for studying causal reasoning in social media contexts.","short_abstract":"Understanding causal language in informal discourse is a core yet underexplored challenge in NLP. Existing datasets largely focus on explicit causality in structured text, providing limited support for detecting implicit causal expressions, particularly those found in informal, user-generated social media posts. We int...","url_abs":"https://arxiv.org/abs/2509.16722","url_pdf":"https://arxiv.org/pdf/2509.16722v1","authors":"[\"Xiaohan Ding\",\"Kaike Ping\",\"Buse Çarık\",\"Eugenia Rho\"]","published":"2025-09-20T15:20:33Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[]","has_code":false}