{"ID":2825345,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.20911","arxiv_id":"2512.20911","title":"Model-free stochastic linear quadratic control for discrete-time systems with multiplicative and additive noises via semidefinite programming","abstract":"This paper investigates a model-free solution to the stochastic linear quadratic regulation (LQR) problem for linear discrete-time systems with both multiplicative and additive noises. We formulate the stochastic LQR problem as a nonconvex optimization problem and rigorously analyze its dual problem structure. By exploiting the inherent convexity of the dual problem and analyzing Karush-Kuhn-Tucker conditions with respect to optimality in convex optimization, we establish an explicit relationship between the optimal point of the dual problem and the parameters of the associated Q-function. This theoretical insight, combined with the technique of the matrix direct sum, makes it possible to develop a novel model-free, sample-efficient, non-iterative semidefinite programming algorithm that directly estimates the optimal control gain without requiring an initial stabilizing controller or noise measurability. The robustness of the model-free SDP method to errors is investigated. Our approach provides a new optimization-theoretic framework for understanding Q-learning algorithms while advancing the theoretical foundations of reinforcement learning in stochastic optimal control. Numerical validation on a pulse-width modulated inverter system demonstrates the algorithm's effectiveness, particularly in achieving a single-step non-iterative solution without hyper-parameter tuning.","short_abstract":"This paper investigates a model-free solution to the stochastic linear quadratic regulation (LQR) problem for linear discrete-time systems with both multiplicative and additive noises. We formulate the stochastic LQR problem as a nonconvex optimization problem and rigorously analyze its dual problem structure. By explo...","url_abs":"https://arxiv.org/abs/2512.20911","url_pdf":"https://arxiv.org/pdf/2512.20911v1","authors":"[\"Jing Guo\",\"Xiushan Jiang\",\"Weihai Zhang\"]","published":"2025-12-24T03:31:48Z","proceeding":"math.OC","tasks":"[\"math.OC\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
