{"ID":2837173,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03059","arxiv_id":"2512.03059","title":"Safe and Sustainable Electric Bus Charging Scheduling with Constrained Hierarchical DRL","abstract":"The integration of Electric Buses (EBs) with renewable energy sources such as photovoltaic (PV) panels is a promising approach to promote sustainable and low-carbon public transportation. However, optimizing EB charging schedules to minimize operational costs while ensuring safe operation without battery depletion remains challenging - especially under real-world conditions, where uncertainties in PV generation, dynamic electricity prices, variable travel times, and limited charging infrastructure must be accounted for. In this paper, we propose a safe Hierarchical Deep Reinforcement Learning (HDRL) framework for solving the EB Charging Scheduling Problem (EBCSP) under multi-source uncertainties. We formulate the problem as a Constrained Markov Decision Process (CMDP) with options to enable temporally abstract decision-making. We develop a novel HDRL algorithm, namely Double Actor-Critic Multi-Agent Proximal Policy Optimization Lagrangian (DAC-MAPPO-Lagrangian), which integrates Lagrangian relaxation into the Double Actor-Critic (DAC) framework. At the high level, we adopt a centralized PPO-Lagrangian algorithm to learn safe charger allocation policies. At the low level, we incorporate MAPPO-Lagrangian to learn decentralized charging power decisions under the Centralized Training and Decentralized Execution (CTDE) paradigm. Extensive experiments with real-world data demonstrate that the proposed approach outperforms existing baselines in both cost minimization and safety compliance, while maintaining fast convergence speed.","short_abstract":"The integration of Electric Buses (EBs) with renewable energy sources such as photovoltaic (PV) panels is a promising approach to promote sustainable and low-carbon public transportation. However, optimizing EB charging schedules to minimize operational costs while ensuring safe operation without battery depletion rema...","url_abs":"https://arxiv.org/abs/2512.03059","url_pdf":"https://arxiv.org/pdf/2512.03059v1","authors":"[\"Jiaju Qi\",\"Lei Lei\",\"Thorsteinn Jonsson\",\"Dusit Niyato\"]","published":"2025-11-25T20:00:02Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
