{"ID":2832276,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.06471","arxiv_id":"2512.06471","title":"Why Goal-Conditioned Reinforcement Learning Works: Relation to Dual Control","abstract":"Goal-conditioned reinforcement learning (RL) concerns the problem of training an agent to maximize the probability of reaching target goal states. This paper presents an analysis of the goal-conditioned setting based on optimal control. In particular, we derive an optimality gap between more classical, often quadratic, objectives and the goal-conditioned reward, elucidating the success of goal-conditioned RL and why classical ``dense'' rewards can falter. We then consider the partially observed Markov decision setting and connect state estimation to our probabilistic reward, making the goal-conditioned reward well suited to dual control problems. The advantages of goal-conditioned policies are validated on nonlinear and uncertain environments using both RL and predictive control techniques.","short_abstract":"Goal-conditioned reinforcement learning (RL) concerns the problem of training an agent to maximize the probability of reaching target goal states. This paper presents an analysis of the goal-conditioned setting based on optimal control. In particular, we derive an optimality gap between more classical, often quadratic,...","url_abs":"https://arxiv.org/abs/2512.06471","url_pdf":"https://arxiv.org/pdf/2512.06471v2","authors":"[\"Nathan P. Lawrence\",\"Ali Mesbah\"]","published":"2025-12-06T15:28:35Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
