{"ID":2884504,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05938","arxiv_id":"2508.05938","title":"Prosocial Behavior Detection in Player Game Chat: From Aligning Human-AI Definitions to Efficient Annotation at Scale","abstract":"Detecting prosociality in text--communication intended to affirm, support, or improve others' behavior--is a novel and increasingly important challenge for trust and safety systems. Unlike toxic content detection, prosociality lacks well-established definitions and labeled data, requiring new approaches to both annotation and deployment. We present a practical, three-stage pipeline that enables scalable, high-precision prosocial content classification while minimizing human labeling effort and inference costs. First, we identify the best LLM-based labeling strategy using a small seed set of human-labeled examples. We then introduce a human-AI refinement loop, where annotators review high-disagreement cases between GPT-4 and humans to iteratively clarify and expand the task definition, a critical step for emerging annotation tasks like prosociality. This process results in improved label quality and definition alignment. Finally, we synthesize 10k high-quality labels using GPT-4 and train a two-stage inference system: a lightweight classifier handles high-confidence predictions, while only $\\sim$35\\% of ambiguous instances are escalated to GPT-4o. This architecture reduces inference costs by $\\sim$70\\% while achieving high precision ($\\sim$0.90). Our pipeline demonstrates how targeted human-AI interaction, careful task formulation, and deployment-aware architecture design can unlock scalable solutions for novel responsible AI tasks.","short_abstract":"Detecting prosociality in text--communication intended to affirm, support, or improve others' behavior--is a novel and increasingly important challenge for trust and safety systems. 
Unlike toxic content detection, prosociality lacks well-established definitions and labeled data, requiring new approaches to both annotat...","url_abs":"https://arxiv.org/abs/2508.05938","url_pdf":"https://arxiv.org/pdf/2508.05938v1","authors":"[\"Rafal Kocielnik\",\"Min Kim\",\"Penphob Boonyarungsrit\",\"Fereshteh Soltani\",\"Deshawn Sambrano\",\"Animashree Anandkumar\",\"R. Michael Alvarez\"]","published":"2025-08-08T02:04:14Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.CY\"]","methods":"[\"Large Language Model\"]","has_code":false}