{"ID":2838261,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.18022","arxiv_id":"2511.18022","title":"GPU-based Split algorithm for Large-Scale CVRPSD","abstract":"Dynamic programming (DP) is a cornerstone of combinatorial optimization, yet its inherently sequential structure has long limited its scalability in scenario-based stochastic programming (SP). This paper introduces a GPU-accelerated framework that reformulates a broad class of forward DP recursions as batched min-plus matrix-vector products over layered DAGs, collapsing actions into masked state-to-state transitions that map seamlessly to GPU kernels. Using this reformulation, our approach takes advantage of massive parallelism across both scenarios and transitions, enabling the simultaneous evaluation of \\emph{over one million uncertainty realizations} in a single GPU pass -- a scale far beyond the reach of existing methods. We instantiate the framework in two canonical applications: the capacitated vehicle routing problem with stochastic demand and a dynamic stochastic inventory routing problem. In both cases, DP subroutines traditionally considered sequential are redesigned to harness two- or three-dimensional GPU parallelism. Experiments demonstrate near-linear scaling in the number of scenarios and yield one to three orders of magnitude speedups over multithreaded CPU baselines, resulting in tighter SAA estimates and significantly stronger first-stage decisions under fixed time budgets. Beyond these applications, our work establishes a general-purpose recipe for transforming classical DP routines into high-throughput GPU primitives, substantially expanding the computational frontier of stochastic discrete optimization to the million-scenario scale.","short_abstract":"Dynamic programming (DP) is a cornerstone of combinatorial optimization, yet its inherently sequential structure has long limited its scalability in scenario-based stochastic programming (SP). This paper introduces a GPU-accelerated framework that reformulates a broad class of forward DP recursions as batched min-plus...","url_abs":"https://arxiv.org/abs/2511.18022","url_pdf":"https://arxiv.org/pdf/2511.18022v1","authors":"[\"Jingyi Zhao\",\"Linxin Yang\",\"Haohua Zhang\",\"Tian Ding\"]","published":"2025-11-22T11:19:38Z","proceeding":"math.OC","tasks":"[\"math.OC\"]","methods":"[]","has_code":false}