{"ID":2822783,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.01350","arxiv_id":"2601.01350","title":"FC-CONAN: An Exhaustively Paired Dataset for Robust Evaluation of Retrieval Systems","abstract":"Hate speech (HS) is a critical issue in online discourse, and one promising strategy to counter it is through the use of counter-narratives (CNs). Datasets linking HS with CNs are essential for advancing counterspeech research. However, even flagship resources like CONAN (Chung et al., 2019) annotate only a sparse subset of all possible HS-CN pairs, limiting evaluation. We introduce FC-CONAN (Fully Connected CONAN), the first dataset created by exhaustively considering all combinations of 45 English HS messages and 129 CNs. A two-stage annotation process involving nine annotators and four validators produces four partitions (Diamond, Gold, Silver, and Bronze) that balance reliability and scale. None of the labeled pairs overlap with CONAN, uncovering hundreds of previously unlabelled positives. FC-CONAN enables more faithful evaluation of counterspeech retrieval systems and facilitates detailed error analysis. The dataset is publicly available.","short_abstract":"Hate speech (HS) is a critical issue in online discourse, and one promising strategy to counter it is through the use of counter-narratives (CNs). Datasets linking HS with CNs are essential for advancing counterspeech research. However, even flagship resources like CONAN (Chung et al., 2019) annotate only a sparse subs...","url_abs":"https://arxiv.org/abs/2601.01350","url_pdf":"https://arxiv.org/pdf/2601.01350v1","authors":"[\"Juan Junqueras\",\"Florian Boudin\",\"May-Myo Zin\",\"Ha-Thanh Nguyen\",\"Wachara Fungwacharakorn\",\"Damián Ariel Furman\",\"Akiko Aizawa\",\"Ken Satoh\"]","published":"2026-01-04T03:38:46Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[]","has_code":false}
