Computational Hardness of Static Distributionally Robust Markov Decision Processes

math.OC arXiv:2511.02224

Abstract

We present hardness results for finding an optimal policy in the static formulation of distributionally robust Markov decision processes. We construct problem instances for which finding the optimal policy is NP-hard when the policy class is Markovian and non-randomized. When the policy class is Markovian and randomized, the robust value function possesses sub-optimal strict local minimizers, and finding the optimal policy is also NP-hard. The instances considered involve an ambiguity set with only two transition kernels.
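To make the setting concrete, the following is a minimal sketch of the static robust objective for Markovian non-randomized policies. It is not the construction used in the paper. It assumes a finite horizon, cost minimization (suggested by the abstract's mention of minimizers), and an adversary that fixes one kernel from the ambiguity set for the whole horizon. All names (`policy_cost`, `static_robust_brute_force`, the array layouts) are hypothetical and chosen for illustration only.

```python
import itertools
import numpy as np


def policy_cost(kernel, costs, policy, start):
    """Expected total cost of a deterministic Markov policy under one kernel.

    Assumed (hypothetical) layout:
      kernel[a][s, s2] : probability of moving from s to s2 under action a
      costs[s, a]      : immediate cost of action a in state s
      policy[t][s]     : action taken at stage t in state s
    """
    n_states = costs.shape[0]
    value = np.zeros(n_states)  # terminal value is zero
    # Backward recursion for a FIXED policy and a FIXED kernel.
    for decision in reversed(policy):
        value = np.array([
            costs[s, decision[s]] + kernel[decision[s]][s] @ value
            for s in range(n_states)
        ])
    return value[start]


def static_robust_brute_force(kernels, costs, horizon, start):
    """Enumerate all Markovian non-randomized policies and return the one
    with the smallest worst-case cost over the given kernels."""
    n_states, n_actions = costs.shape
    # One decision rule per stage: an action for every state.
    stage_rules = list(itertools.product(range(n_actions), repeat=n_states))
    best_policy, best_value = None, float("inf")
    for policy in itertools.product(stage_rules, repeat=horizon):
        # Static formulation: the same kernel applies at every stage.
        worst = max(policy_cost(k, costs, policy, start) for k in kernels)
        if worst < best_value:
            best_policy, best_value = policy, worst
    return best_policy, best_value
```

The enumeration visits (|A|^|S|)^T policies for |S| states, |A| actions, and horizon T, so it is only usable on toy instances. Because the worst case is taken over whole kernels rather than stage by stage, the sketch does not use a dynamic programming recursion over the robust value.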
