{"ID":2839268,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.16823","arxiv_id":"2511.16823","title":"Monte Carlo Expected Threat (MOCET) Scoring","abstract":"Evaluating and measuring AI Safety Level (ASL) threats are crucial for guiding stakeholders to implement safeguards that keep risks within acceptable limits. ASL-3+ models present a unique risk in their ability to uplift novice non-state actors, especially in the realm of biosecurity. Existing evaluation metrics, such as LAB-Bench, BioLP-bench, and WMDP, can reliably assess model uplift and domain knowledge. However, metrics that better contextualize \"real-world risks\" are needed to inform the safety case for LLMs, along with scalable, open-ended metrics to keep pace with their rapid advancements. To address both gaps, we introduce MOCET, an interpretable and doubly-scalable metric (automatable and open-ended) that can quantify real-world risks.","short_abstract":"Evaluating and measuring AI Safety Level (ASL) threats are crucial for guiding stakeholders to implement safeguards that keep risks within acceptable limits. ASL-3+ models present a unique risk in their ability to uplift novice non-state actors, especially in the realm of biosecurity. Existing evaluation metrics, such...","url_abs":"https://arxiv.org/abs/2511.16823","url_pdf":"https://arxiv.org/pdf/2511.16823v1","authors":"[\"Joseph Kim\",\"Saahith Potluri\"]","published":"2025-11-20T22:06:13Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.HC\"]","methods":"[\"Large Language Model\"]","has_code":false}
