PG-Flow: Deterministic implicit policy gradients for geometric product-form queueing networks
Abstract
Product-form queueing networks (PFQNs) admit steady-state distributions that factorize into local terms. In many classical PFQNs, including Jackson networks, BCMP networks, G-networks, and Energy Packet Networks, the resulting marginals are geometric and parametrized by local flow variables that satisfy balance equations. While this structure yields closed-form expressions for key performance metrics, its use for deterministic steady-state optimization remains limited. We introduce PG-Flow, a deterministic policy-gradient framework that differentiates through the steady-state flow fixed-point equations, yielding exact gradients via implicit differentiation and a local adjoint system while avoiding both trajectory sampling and Poisson equations. We establish global convergence under structural assumptions (affine flow operators and convex local costs) and show that acyclic networks admit linear-time computation of both flows and gradients. Numerical experiments on routing control in Jackson networks and energy-arrival control in Energy Packet Networks demonstrate that PG-Flow provides a principled and computationally efficient approach to deterministic steady-state optimization in geometric product-form networks.
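To make the idea of differentiating through the flow fixed point concrete, the following is a minimal sketch for a small open Jackson network. It is an illustration under stated assumptions and not the authors' implementation: the three-node topology, the rates `GAMMA` and `MU`, the scalar routing parameter `theta`, and all function names are hypothetical, and the cost is taken to be the mean total number of customers, the sum over nodes of lambda_i / (mu_i - lambda_i). The flows solve the traffic equations lambda = gamma + P(theta)^T lambda, and the gradient is obtained from a single adjoint linear solve rather than from sampled trajectories.

```python
import numpy as np

# Hypothetical 3-node open Jackson network (all values are illustrative).
GAMMA = np.array([1.0, 0.0, 0.0])   # external arrival rates
MU = np.array([3.0, 2.0, 2.0])      # service rates

def routing(theta):
    """Routing matrix P(theta): node 0 sends to node 1 w.p. theta and to
    node 2 w.p. 1 - theta; node 1 feeds back to node 0 w.p. 0.2."""
    return np.array([[0.0, theta, 1.0 - theta],
                     [0.2, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])

# Derivative of P(theta) with respect to theta (constant for this example).
D_ROUTING = np.array([[0.0, 1.0, -1.0],
                      [0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])

def flows(theta):
    """Solve the traffic equations (I - P^T) lam = gamma."""
    P = routing(theta)
    return np.linalg.solve(np.eye(3) - P.T, GAMMA)

def cost(lam):
    """Mean total number of customers, assuming lam < MU componentwise."""
    return float(np.sum(lam / (MU - lam)))

def cost_and_gradient(theta):
    """Return the cost and its exact derivative in theta via an adjoint solve."""
    P = routing(theta)
    lam = flows(theta)
    g = MU / (MU - lam) ** 2                   # partial derivative of cost in lam
    nu = np.linalg.solve(np.eye(3) - P, g)     # adjoint system (I - P) nu = g
    grad = float(lam @ D_ROUTING @ nu)         # dJ/dtheta = lam^T (dP/dtheta) nu
    return cost(lam), grad

if __name__ == "__main__":
    value, grad = cost_and_gradient(0.5)
    h = 1e-6  # central finite difference as a sanity check
    fd = (cost(flows(0.5 + h)) - cost(flows(0.5 - h))) / (2 * h)
    print(value, grad, fd)
```

The adjoint solve costs the same as the forward flow solve regardless of how many parameters the routing depends on, which is the usual reason to prefer it over differentiating the flows parameter by parameter. The sketch uses dense solves for brevity and does not attempt the linear-time acyclic computation mentioned above.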