{"ID":2866590,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.20114","arxiv_id":"2509.20114","title":"Beyond Slater's Condition in Online CMDPs with Stochastic and Adversarial Constraints","abstract":"We study \\emph{online episodic Constrained Markov Decision Processes} (CMDPs) under both stochastic and adversarial constraints. We provide a novel algorithm whose guarantees greatly improve those of the state-of-the-art best-of-both-worlds algorithm introduced by Stradi et al. (2025). In the stochastic regime, \\emph{i.e.}, when the constraints are sampled from fixed but unknown distributions, our method achieves $\\widetilde{\\mathcal{O}}(\\sqrt{T})$ regret and constraint violation without relying on Slater's condition, thereby handling settings where no strictly feasible solution exists. Moreover, we provide guarantees on the stronger notion of \\emph{positive} constraint violation, which does not allow to recover from large violation in the early episodes by playing strictly safe policies. In the adversarial regime, \\emph{i.e.}, when the constraints may change arbitrarily between episodes, our algorithm ensures sublinear constraint violation without Slater's condition, and achieves sublinear $α$-regret with respect to the \\emph{unconstrained} optimum, where $α$ is a suitably defined multiplicative approximation factor. We further validate our results through synthetic experiments, showing the practical effectiveness of our algorithm.","short_abstract":"We study \\emph{online episodic Constrained Markov Decision Processes} (CMDPs) under both stochastic and adversarial constraints. We provide a novel algorithm whose guarantees greatly improve those of the state-of-the-art best-of-both-worlds algorithm introduced by Stradi et al. (2025). In the stochastic regime, \\emph{i...","url_abs":"https://arxiv.org/abs/2509.20114","url_pdf":"https://arxiv.org/pdf/2509.20114v2","authors":"[\"Francesco Emanuele Stradi\",\"Eleonora Fidelia Chiefari\",\"Matteo Castiglioni\",\"Alberto Marchesi\",\"Nicola Gatti\"]","published":"2025-09-24T13:38:32Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}
