{"ID":2859089,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05484","arxiv_id":"2510.05484","title":"Evaluating LLM Safety Across Child Development Stages: A Simulated Agent Approach","abstract":"Current safety alignment for Large Language Models (LLMs) implicitly optimizes for a \"modal adult user,\" leaving models vulnerable to distributional shifts in user cognition. We present ChildSafe, a benchmark that quantifies alignment robustness under cognitive shifts corresponding to four developmental stages. Unlike static persona-based evaluations, we introduce a parametric cognitive simulation approach, formalizing developmental stages as hyperparameter constraints (e.g., volatility, context horizon) to generate out-of-distribution interaction traces. We validate these agents against ground-truth human linguistic data (CHILDES) and deploy them across 1,200 multi-turn interactions. Our results reveal a systematic alignment generalization gap: state-of-the-art models exhibit up to 11.5% performance degradation when interacting with early-childhood agents compared to standard baselines. We provide the research community with the validated agent artifacts and evaluation protocols to facilitate robust alignment testing against non-adversarial, cognitively diverse populations.","short_abstract":"Current safety alignment for Large Language Models (LLMs) implicitly optimizes for a \"modal adult user,\" leaving models vulnerable to distributional shifts in user cognition. We present ChildSafe, a benchmark that quantifies alignment robustness under cognitive shifts corresponding to four developmental stages. Unlike...","url_abs":"https://arxiv.org/abs/2510.05484","url_pdf":"https://arxiv.org/pdf/2510.05484v2","authors":"[\"Abhejay Murali\",\"Saleh Afroogh\",\"Kevin Chen\",\"David Atkinson\",\"Amit Dhurandhar\",\"Junfeng Jiao\"]","published":"2025-10-07T01:01:04Z","proceeding":"cs.CY","tasks":"[\"cs.CY\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
