{"ID":2843538,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.08487","arxiv_id":"2511.08487","title":"How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity","abstract":"Current safety evaluations for LLM-driven agents primarily focus on atomic harms, failing to address sophisticated threats where malicious intent is concealed or diluted within complex tasks. We address this gap with a two-dimensional analysis of agent safety brittleness under the orthogonal pressures of intent concealment and task complexity. To enable this, we introduce OASIS (Orthogonal Agent Safety Inquiry Suite), a hierarchical benchmark with fine-grained annotations and a high-fidelity simulation sandbox. Our findings reveal two critical phenomena: safety alignment degrades sharply and predictably as intent becomes obscured, and a \"Complexity Paradox\" emerges, where agents seem safer on harder tasks only due to capability limitations. By releasing OASIS and its simulation environment, we provide a principled foundation for probing and strengthening agent safety in these overlooked dimensions.","short_abstract":"Current safety evaluations for LLM-driven agents primarily focus on atomic harms, failing to address sophisticated threats where malicious intent is concealed or diluted within complex tasks. We address this gap with a two-dimensional analysis of agent safety brittleness under the orthogonal pressures of intent conceal...","url_abs":"https://arxiv.org/abs/2511.08487","url_pdf":"https://arxiv.org/pdf/2511.08487v1","authors":"[\"Zihan Ma\",\"Dongsheng Zhu\",\"Shudong Liu\",\"Taolin Zhang\",\"Junnan Liu\",\"Qingqiu Li\",\"Minnan Luo\",\"Songyang Zhang\",\"Kai Chen\"]","published":"2025-11-11T17:27:27Z","proceeding":"cs.MA","tasks":"[\"cs.MA\",\"cs.CL\"]","methods":"[\"Large Language Model\"]","has_code":false}
