{"ID":2850291,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.22267","arxiv_id":"2510.22267","title":"Actor-Critic Learning for Risk-Constrained Linear Quadratic Regulation","abstract":"In this paper, we investigate the infinite-horizon risk-constrained linear quadratic regulator problem (RC-QR), which augments the classical LQR formulation with a statistical constraint on the variability of the system state to incorporate risk awareness, a key requirement in safety-critical control applications. We propose an actor-critic learning algorithm that jointly performs policy evaluation and policy improvement in a model-free and online manner. The RC-QR problem is first reformulated as a max-min optimization problem, from which we develop a multi-time-scale stochastic approximation scheme. The critic employs temporal-difference learning to estimate the action-value function, the actor updates the policy parameters via a policy gradient step, and the dual variable is adapted through gradient ascent to enforce the risk constraint.","short_abstract":"In this paper, we investigate the infinite-horizon risk-constrained linear quadratic regulator problem (RC-QR), which augments the classical LQR formulation with a statistical constraint on the variability of the system state to incorporate risk awareness, a key requirement in safety-critical control applications. We p...","url_abs":"https://arxiv.org/abs/2510.22267","url_pdf":"https://arxiv.org/pdf/2510.22267v1","authors":"[\"Weijian Li\",\"Andreas A. Malikopoulos\"]","published":"2025-10-25T12:15:54Z","proceeding":"math.OC","tasks":"[\"math.OC\"]","methods":"[]","has_code":false}