{"ID":2863473,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24541","arxiv_id":"2509.24541","title":"Markov Decision Processing Networks","abstract":"We introduce Markov Decision Processing Networks (MDPNs) as a multiclass queueing network model where service is a controlled, finite-state Markov process. The model exhibits a decision-dependent service process where actions taken influence future service availability. Viewed as a two-sided queueing model, this captures settings such as assemble-to-order systems, ride-hailing platforms, cross-skilled call centers, and quantum switches. We first characterize the capacity region of MDPNs. Unlike classical switched networks, the MDPN capacity region depends on the long-run mix of service states induced by the control of the underlying service process. We show, via a counterexample, that MaxWeight is not throughput-optimal in this class, demonstrating the distinction between MDPNs and classical queueing models. To bridge this gap, we design a weighted average reward policy, a multiobjective MDP that leverages a two-timescale separation at the fluid scale. We prove throughput-optimality of the resulting policy. The techniques yield a clear capacity region description and apply to a broad family of two-sided matching systems.","short_abstract":"We introduce Markov Decision Processing Networks (MDPNs) as a multiclass queueing network model where service is a controlled, finite-state Markov process. The model exhibits a decision-dependent service process where actions taken influence future service availability. \nViewed as a two-sided queueing model, this captur...","url_abs":"https://arxiv.org/abs/2509.24541","url_pdf":"https://arxiv.org/pdf/2509.24541v1","authors":"[\"Sanidhay Bhambay\",\"Thirupathaiah Vasantam\",\"Neil Walton\"]","published":"2025-09-29T09:54:34Z","proceeding":"math.OC","tasks":"[\"math.OC\",\"cs.NI\",\"math.PR\"]","methods":"[]","has_code":false}
