{"ID":2841728,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.11308","arxiv_id":"2511.11308","title":"Policy Optimization for Unknown Systems using Differentiable Model Predictive Control","abstract":"Model-based policy optimization often struggles with inaccurate system dynamics models, leading to suboptimal closed-loop performance. This challenge is especially evident in Model Predictive Control (MPC) policies, which rely on the model for real-time trajectory planning and optimization. We introduce a novel policy optimization framework for MPC-based policies combining differentiable optimization with zeroth-order optimization. Our method combines model-based and model-free gradient estimation approaches, achieving faster transient performance compared to fully data-driven approaches while maintaining convergence guarantees, even under model uncertainty. We demonstrate the effectiveness of the proposed approach on a nonlinear control task involving a 12-dimensional quadcopter model.","short_abstract":"Model-based policy optimization often struggles with inaccurate system dynamics models, leading to suboptimal closed-loop performance. This challenge is especially evident in Model Predictive Control (MPC) policies, which rely on the model for real-time trajectory planning and optimization. We introduce a novel policy...","url_abs":"https://arxiv.org/abs/2511.11308","url_pdf":"https://arxiv.org/pdf/2511.11308v2","authors":"[\"Riccardo Zuliani\",\"Efe C. Balta\",\"John Lygeros\"]","published":"2025-11-14T13:51:33Z","proceeding":"eess.SY","tasks":"[\"eess.SY\",\"math.OC\"]","methods":"[]","has_code":false}