{"ID":2872766,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.08956","arxiv_id":"2509.08956","title":"Multi-Agent Inverse Reinforcement Learning for Identifying Pareto-Efficient Coordination -- A Distributionally Robust Approach","abstract":"Multi-agent inverse reinforcement learning (IRL) aims to identify Pareto-efficient behavior in a multi-agent system, and reconstruct utility functions of the individual agents. Motivated by the problem of detecting UAV coordination, how can we construct a statistical detector for Pareto-efficient behavior given noisy measurements of the decisions of a multi-agent system? This paper approaches this IRL problem by deriving necessary and sufficient conditions for a dataset of multi-agent system dynamics to be consistent with Pareto-efficient coordination, and providing algorithms for recovering utility functions which are consistent with the system dynamics. We derive an optimal statistical detector for determining Pareto-efficient coordination from noisy system measurements, which minimizes Type-I statistical detection error. Then, we provide a utility estimation algorithm which minimizes the worst-case estimation error over a statistical ambiguity set centered at empirical observations; this min-max solution achieves distributionally robust IRL, which is crucial in adversarial strategic interactions. We illustrate these results in a detailed example for detecting Pareto-efficient coordination among multiple UAVs given noisy measurement recorded at a radar. We then reconstruct the utility functions of the UAVs in a distributionally robust sense.","short_abstract":"Multi-agent inverse reinforcement learning (IRL) aims to identify Pareto-efficient behavior in a multi-agent system, and reconstruct utility functions of the individual agents. Motivated by the problem of detecting UAV coordination, how can we construct a statistical detector for Pareto-efficient behavior given noisy m...","url_abs":"https://arxiv.org/abs/2509.08956","url_pdf":"https://arxiv.org/pdf/2509.08956v1","authors":"[\"Luke Snow\",\"Vikram Krishnamurthy\"]","published":"2025-09-10T19:30:04Z","proceeding":"eess.SY","tasks":"[\"eess.SY\",\"eess.SP\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
