{"ID":2869427,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.15090","arxiv_id":"2509.15090","title":"Emergent Alignment via Competition","abstract":"Aligning AI systems with human values remains a fundamental challenge, but does our inability to create perfectly aligned models preclude obtaining the benefits of alignment? We study a strategic setting where a human user interacts with multiple differently misaligned AI agents, none of which are individually well-aligned. Our key insight is that when the user's utility lies approximately within the convex hull of the agents' utilities, a condition that becomes easier to satisfy as model diversity increases, strategic competition can yield outcomes comparable to interacting with a perfectly aligned model. We model this as a multi-leader Stackelberg game, extending Bayesian persuasion to multi-round conversations between differently informed parties, and prove three results: (1) when perfect alignment would allow the user to learn her Bayes-optimal action, she can also do so in all equilibria under the convex hull condition; (2) under weaker assumptions requiring only approximate utility learning, a non-strategic user employing quantal response achieves near-optimal utility in all equilibria; and (3) when the user selects the best single AI after an evaluation period, equilibrium guarantees remain near-optimal without further distributional assumptions. We complement the theory with two sets of experiments.","short_abstract":"Aligning AI systems with human values remains a fundamental challenge, but does our inability to create perfectly aligned models preclude obtaining the benefits of alignment? We study a strategic setting where a human user interacts with multiple differently misaligned AI agents, none of which are individually well-ali...","url_abs":"https://arxiv.org/abs/2509.15090","url_pdf":"https://arxiv.org/pdf/2509.15090v2","authors":"[\"Natalie Collina\",\"Surbhi Goel\",\"Aaron Roth\",\"Emily Ryu\",\"Mirah Shi\"]","published":"2025-09-18T15:47:00Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.GT\",\"econ.TH\"]","methods":"[]","has_code":false}
