An Agent-Centric Dynamical Systems Perspective on Multi-Agent Reinforcement Learning

cs.MA arXiv:2512.07588

Abstract

Analysing learning in Multi-Agent Reinforcement Learning (MARL) environments is challenging, particularly with respect to individual decision-making. Practitioners frequently struggle to compare training runs because of the inherent stochasticity of the algorithms, which arises from random dithering exploration, environment transition noise, and stochastic gradient updates, to name a few sources. Traditional analytical approaches, such as replicator dynamics, often rely on mean-field approximations to remove stochastic effects. This simplification captures general trends, but it can lead to a mismatch between analytical predictions and the behaviour that agents actually realise. We propose modelling MARL training as a coupled stochastic dynamical system that captures both agent interactions and environmental characteristics. Leveraging tools from dynamical systems theory, we pragmatically analyse the stability and sensitivity of agent behaviour, which are key considerations for practical deployment, for example in the presence of strict safety requirements. This framework allows us to rigorously study the inherent stochasticity of MARL, providing a deeper understanding of system behaviour.
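To make the idea of a coupled stochastic dynamical system concrete, the following is a minimal sketch and not the method of the paper. It assumes two independent, stateless epsilon-greedy Q-learners in a hypothetical 2x2 coordination game; the payoff matrix, the learning rate `alpha`, the exploration rate `eps`, and all function names are illustrative assumptions. The joint Q-table plays the role of the system state, each agent's update depends on the other's action (the coupling), and the random number generator supplies the exploration noise.

```python
import numpy as np

# Assumed toy game: both agents receive reward 1 if their actions match, else 0.
PAYOFF = np.array([[1.0, 0.0], [0.0, 1.0]])


def step(q, rng, alpha=0.1, eps=0.1):
    """One transition of the joint state q (shape: 2 agents x 2 actions)."""
    actions = np.empty(2, dtype=int)
    for i in range(2):
        if rng.random() < eps:           # dithering exploration (noise source)
            actions[i] = rng.integers(2)
        else:                            # greedy action; ties resolve to action 0
            actions[i] = int(np.argmax(q[i]))
    reward = PAYOFF[actions[0], actions[1]]  # coupling: reward depends on both agents
    new_q = q.copy()
    for i in range(2):
        a = actions[i]
        new_q[i, a] += alpha * (reward - q[i, a])
    return new_q


def run(seed, steps=500, q0=None):
    """Simulate one realisation; returns the trajectory, shape (steps + 1, 2, 2)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((2, 2)) if q0 is None else q0.copy()
    traj = np.empty((steps + 1, 2, 2))
    traj[0] = q
    for t in range(steps):
        q = step(q, rng)
        traj[t + 1] = q
    return traj


def seed_dispersion(seeds, steps=500):
    """Per-entry standard deviation of the final Q-values across noise seeds."""
    finals = np.stack([run(s, steps)[-1] for s in seeds])
    return finals.std(axis=0)


def divergence(seed, delta=1e-3, steps=500):
    """Distance over time between a run and a copy with a perturbed initial state.

    Both runs share the same seed, so they see the same random draws; any gap
    is due to the initial perturbation alone.
    """
    base = run(seed, steps)
    q0 = np.zeros((2, 2))
    q0[0, 0] += delta
    pert = run(seed, steps, q0=q0)
    return np.linalg.norm((pert - base).reshape(steps + 1, -1), axis=1)


if __name__ == "__main__":
    print("dispersion across seeds:\n", seed_dispersion(range(20)))
    print("final gap after perturbation:", divergence(0)[-1])
```

Under these assumptions, `seed_dispersion` gives a rough view of how much individual realisations differ from one another, which a mean-field description would average away, while `divergence` probes sensitivity to the initial condition under a fixed noise sequence.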
