{"ID":2896071,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.07640","arxiv_id":"2507.07640","title":"Lost in Pronunciation: Detecting Chinese Offensive Language Disguised by Phonetic Cloaking Replacement","abstract":"Phonetic Cloaking Replacement (PCR), defined as the deliberate use of homophonic or near-homophonic variants to hide toxic intent, has become a major obstacle to Chinese content moderation. While this problem is well-recognized, existing evaluations predominantly rely on rule-based, synthetic perturbations that ignore the creativity of real users. We organize PCR into a four-way surface-form taxonomy and compile \\ours, a dataset of 500 naturally occurring, phonetically cloaked offensive posts gathered from the RedNote platform. Benchmarking state-of-the-art LLMs on this dataset exposes a serious weakness: the best model reaches only an F1-score of 0.672, and zero-shot chain-of-thought prompting pushes performance even lower. Guided by error analysis, we revisit a Pinyin-based prompting strategy that earlier studies judged ineffective and show that it recovers much of the lost accuracy. This study offers the first comprehensive taxonomy of Chinese PCR, a realistic benchmark that reveals current detectors' limits, and a lightweight mitigation technique that advances research on robust toxicity detection.","short_abstract":"Phonetic Cloaking Replacement (PCR), defined as the deliberate use of homophonic or near-homophonic variants to hide toxic intent, has become a major obstacle to Chinese content moderation. While this problem is well-recognized, existing evaluations predominantly rely on rule-based, synthetic perturbations that ignore...","url_abs":"https://arxiv.org/abs/2507.07640","url_pdf":"https://arxiv.org/pdf/2507.07640v1","authors":"[\"Haotan Guo\",\"Jianfei He\",\"Jiayuan Ma\",\"Hongbin Na\",\"Zimu Wang\",\"Haiyang Zhang\",\"Qi Chen\",\"Wei Wang\",\"Zijing Shi\",\"Tao Shen\",\"Ling Chen\"]","published":"2025-07-10T11:09:26Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Generative Adversarial Network\"]","has_code":false}
