{"ID":2846122,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.02366","arxiv_id":"2511.02366","title":"LiveSecBench: A Dynamic and Event-Driven Safety Benchmark for Chinese Language Model Applications","abstract":"We introduce LiveSecBench, a continuously updated safety benchmark specifically for Chinese-language LLM application scenarios. LiveSecBench constructs a high-quality and unique dataset through a pipeline that combines automated generation with human verification. By periodically releasing new versions to expand the dataset and update evaluation metrics, LiveSecBench provides a robust and up-to-date standard for AI safety. In this report, we introduce our second release v251215, which evaluates across five dimensions (Public Safety, Fairness \u0026 Bias, Privacy, Truthfulness, and Mental Health Safety). We evaluate 57 representative LLMs using an ELO rating system, offering a leaderboard of the current state of Chinese LLM safety. The results are available at https://livesecbench.intokentech.cn/.","short_abstract":"We introduce LiveSecBench, a continuously updated safety benchmark specifically for Chinese-language LLM application scenarios. LiveSecBench constructs a high-quality and unique dataset through a pipeline that combines automated generation with human verification. By periodically releasing new versions to expand the da...","url_abs":"https://arxiv.org/abs/2511.02366","url_pdf":"https://arxiv.org/pdf/2511.02366v2","authors":"[\"Yudong Li\",\"Peiru Yang\",\"Feng Huang\",\"Zhongliang Yang\",\"Kecheng Wang\",\"Haitian Li\",\"Baocheng Chen\",\"Xingyu An\",\"Ziyu Liu\",\"Youdan Yang\",\"Kejiang Chen\",\"Sifang Wan\",\"Xu Wang\",\"Yufei Sun\",\"Liyan Wu\",\"Ruiqi Zhou\",\"Wenya Wen\",\"Xingchi Gu\",\"Tianxin Zhang\",\"Yue Gao\",\"Yongfeng Huang\"]","published":"2025-11-04T08:44:09Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","project_urls":"[\"https://livesecbench.intokentech.cn/\"]","has_code":false,"code_links":[{"ID":607409,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2846122,"paper_url":"https://arxiv.org/abs/2511.02366","paper_title":"LiveSecBench: A Dynamic and Event-Driven Safety Benchmark for Chinese Language Model Applications","repo_url":"https://github.com/ydli-ai/LiveSecBench","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
