{"ID":2871620,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.10999","arxiv_id":"2509.10999","title":"Real-Time Defense Against Coordinated Cyber-Physical Attacks: A Robust Constrained Reinforcement Learning Approach","abstract":"Modern power systems face increasing vulnerability to sophisticated cyber-physical attacks beyond traditional N-1 contingency frameworks. Existing security paradigms face a critical bottleneck: efficiently identifying worst-case scenarios and rapidly coordinating defensive responses are hindered by intensive computation and time delays, during which cascading failures can propagate. This paper presents a novel tri-level robust constrained reinforcement learning (RCRL) framework for robust power system security. The framework generates diverse system states through AC-OPF formulations, identifies worst-case N-K attack scenarios for each state, and trains policies to mitigate these scenarios across all operating conditions without requiring predefined attack patterns. The framework addresses constraint satisfaction through Beta-blending projection-based feasible action mapping techniques during training and primal-dual augmented Lagrangian optimization for deployment. Once trained, the RCRL policy learns how to control observed cyber-physical attacks in real time. Validation on IEEE benchmark systems demonstrates effectiveness against coordinated N-K attacks, causing widespread cascading failures throughout the network. The learned policy can successfully respond rapidly to recover system-wide constraints back to normal within 0.21 ms inference times, establishing superior resilience for critical infrastructure protection.","short_abstract":"Modern power systems face increasing vulnerability to sophisticated cyber-physical attacks beyond traditional N-1 contingency frameworks. Existing security paradigms face a critical bottleneck: efficiently identifying worst-case scenarios and rapidly coordinating defensive responses are hindered by intensive computatio...","url_abs":"https://arxiv.org/abs/2509.10999","url_pdf":"https://arxiv.org/pdf/2509.10999v2","authors":"[\"Saman Mazaheri Khamaneh\",\"Tong Wu\",\"Wei Sun\",\"Cong Chen\"]","published":"2025-09-13T22:49:39Z","proceeding":"eess.SY","tasks":"[\"eess.SY\",\"eess.SP\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
