{"ID":2842777,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.09219","arxiv_id":"2511.09219","title":"Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization","abstract":"Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B\u0026B). A key driver influencing B\u0026B solver efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-crafted heuristics, recent work has explored adapting traditional reinforcement learning (RL) algorithms to the B\u0026B setting, aiming to learn branching strategies tailored to specific MILP distributions. In parallel, RL agents have achieved remarkable success in board games, a very specific type of combinatorial problem, by leveraging environment simulators to plan via Monte Carlo Tree Search (MCTS). Building on these developments, we introduce Plan-and-Branch-and-Bound (PlanB\u0026B), a model-based reinforcement learning (MBRL) agent that leverages a learned internal model of the B\u0026B dynamics to discover improved branching strategies. Computational experiments empirically validate our approach, with our MBRL branching agent outperforming previous state-of-the-art RL methods across four standard MILP benchmarks.","short_abstract":"Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B\u0026B). A key driver influencing B\u0026B solver efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-c...","url_abs":"https://arxiv.org/abs/2511.09219","url_pdf":"https://arxiv.org/pdf/2511.09219v4","authors":"[\"Paul Strang\",\"Zacharie Alès\",\"Côme Bissuel\",\"Olivier Juan\",\"Safia Kedad-Sidhoum\",\"Emmanuel Rachelson\"]","published":"2025-11-12T11:28:08Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}