{"ID":2848829,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.00039","arxiv_id":"2511.00039","title":"Graph-Attentive MAPPO for Dynamic Retail Pricing","abstract":"Dynamic pricing in retail requires policies that adapt to shifting demand while coordinating decisions across related products. We present a systematic empirical study of multi-agent reinforcement learning for retail price optimization, comparing a strong MAPPO baseline with a graph-attention-augmented variant (MAPPO+GAT) that leverages learned interactions among products. Using a simulated pricing environment derived from real transaction data, we evaluate profit, stability across random seeds, fairness across products, and training efficiency under a standardized evaluation protocol. The results indicate that MAPPO provides a robust and reproducible foundation for portfolio-level price control, and that MAPPO+GAT further enhances performance by sharing information over the product graph without inducing excessive price volatility. These results indicate that graph-integrated MARL provides a more scalable and stable solution than independent learners for dynamic retail pricing, offering practical advantages in multi-product decision-making.","short_abstract":"Dynamic pricing in retail requires policies that adapt to shifting demand while coordinating decisions across related products. We present a systematic empirical study of multi-agent reinforcement learning for retail price optimization, comparing a strong MAPPO baseline with a graph-attention-augmented variant (MAPPO+G...","url_abs":"https://arxiv.org/abs/2511.00039","url_pdf":"https://arxiv.org/pdf/2511.00039v1","authors":"[\"Krishna Kumar Neelakanta Pillai Santha Kumari Amma\"]","published":"2025-10-28T00:15:59Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}