{"ID":2859786,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.04641","arxiv_id":"2510.04641","title":"Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study","abstract":"Large-scale web-scraped text corpora used to train general-purpose AI models often contain harmful demographic-targeted social biases, creating a regulatory need for data auditing and developing scalable bias-detection methods. Although prior work has investigated biases in text datasets and related detection methods, these studies remain narrow in scope. They typically focus on a single content type (e.g., hate speech), cover limited demographic axes, overlook biases affecting multiple demographics simultaneously, and analyze limited techniques. Consequently, practitioners lack a holistic understanding of the strengths and limitations of recent large language models (LLMs) for automated bias detection. In this study, we conduct a comprehensive benchmark study on English texts to assess the ability of LLMs in detecting demographic-targeted social biases. To align with regulatory requirements, we frame bias detection as a multi-label task of detecting targeted identities using a demographic-focused taxonomy. We then systematically evaluate models across scales and techniques, including prompting, in-context learning, and fine-tuning. Using twelve datasets spanning diverse content types and demographics, our study demonstrates the promise of fine-tuned smaller models for scalable detection. However, our analyses also expose persistent gaps across demographic axes and multi-demographic targeted biases, underscoring the need for more effective and scalable detection frameworks.","short_abstract":"Large-scale web-scraped text corpora used to train general-purpose AI models often contain harmful demographic-targeted social biases, creating a regulatory need for data auditing and developing scalable bias-detection methods. Although prior work has investigated biases in text datasets and related detection methods,...","url_abs":"https://arxiv.org/abs/2510.04641","url_pdf":"https://arxiv.org/pdf/2510.04641v3","authors":"[\"Ayan Majumdar\",\"Feihao Chen\",\"Jinghui Li\",\"Xiaozhen Wang\"]","published":"2025-10-06T09:45:32Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.CY\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
