{"ID":2823776,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.24955","arxiv_id":"2512.24955","title":"MSACL: Multi-Step Actor-Critic Learning with Lyapunov Certificates for Exponentially Stabilizing Control","abstract":"For stabilizing control tasks, model-free reinforcement learning (RL) approaches face numerous challenges, particularly regarding the issues of effectiveness and efficiency in complex high-dimensional environments with limited training data. To address these challenges, we propose Multi-Step Actor-Critic Learning with Lyapunov Certificates (MSACL), a novel approach that integrates exponential stability into off-policy maximum entropy reinforcement learning (MERL). In contrast to existing RL-based approaches that depend on elaborate reward engineering and single-step constraints, MSACL adopts intuitive reward design and exploits multi-step samples to enable exploratory actor-critic learning. Specifically, we first introduce Exponential Stability Labels (ESLs) to categorize training samples and propose a $λ$-weighted aggregation mechanism to learn Lyapunov certificates. Based on these certificates, we further design a stability-aware advantage function to guide policy optimization, thereby promoting rapid Lyapunov descent and robust state convergence. We evaluate MSACL across six benchmarks, comprising four stabilizing and two high-dimensional tracking tasks. Experimental results demonstrate its consistent performance improvements over both standard RL baselines and state-of-the-art Lyapunov-based RL algorithms. Beyond rapid convergence, MSACL exhibits robustness against environmental uncertainties and generalization to unseen reference signals. The source code and benchmarking environments are available at \\href{https://github.com/YuanZhe-Xing/MSACL}{https://github.com/YuanZhe-Xing/MSACL}.","short_abstract":"For stabilizing control tasks, model-free reinforcement learning (RL) approaches face numerous challenges, particularly regarding the issues of effectiveness and efficiency in complex high-dimensional environments with limited training data. To address these challenges, we propose Multi-Step Actor-Critic Learning with...","url_abs":"https://arxiv.org/abs/2512.24955","url_pdf":"https://arxiv.org/pdf/2512.24955v3","authors":"[\"Yongwei Zhang\",\"Yuanzhe Xing\",\"Quanyi Liang\",\"Quan Quan\",\"Zhikun She\"]","published":"2025-12-31T16:36:44Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.RO\",\"eess.SY\"]","methods":"[\"Reinforcement Learning\",\"LoRA\"]","has_code":false,"code_links":[{"ID":605526,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2823776,"paper_url":"https://arxiv.org/abs/2512.24955","paper_title":"MSACL: Multi-Step Actor-Critic Learning with Lyapunov Certificates for Exponentially Stabilizing Control","repo_url":"https://github.com/YuanZhe-Xing/MSACL","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}