{"ID":2847310,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.00637","arxiv_id":"2511.00637","title":"Stochastic Shortest Path with Sparse Adversarial Costs","abstract":"We study the adversarial Stochastic Shortest Path (SSP) problem with sparse costs under full-information feedback. In the known transition setting, existing bounds based on Online Mirror Descent (OMD) with negative-entropy regularization scale with $\\sqrt{\\log S A}$, where $SA$ is the size of the state-action space. While we show that this is optimal in the worst-case, this bound fails to capture the benefits of sparsity when only a small number $M \\ll SA$ of state-action pairs incur cost. In fact, we also show that the negative-entropy is inherently non-adaptive to sparsity: it provably incurs regret scaling with $\\sqrt{\\log S}$ on sparse problems. Instead, we propose a family of $\\ell_r$-norm regularizers ($r \\in (1,2)$) that adapts to the sparsity and achieves regret scaling with $\\sqrt{\\log M}$ instead of $\\sqrt{\\log SA}$. We show this is optimal via a matching lower bound, highlighting that $M$ captures the effective dimension of the problem instead of $SA$. Finally, in the unknown transition setting the benefits of sparsity are limited: we prove that even on sparse problems, the minimax regret for any learner scales polynomially with $SA$.","short_abstract":"We study the adversarial Stochastic Shortest Path (SSP) problem with sparse costs under full-information feedback. In the known transition setting, existing bounds based on Online Mirror Descent (OMD) with negative-entropy regularization scale with $\\sqrt{\\log S A}$, where $SA$ is the size of the state-action space. Wh...","url_abs":"https://arxiv.org/abs/2511.00637","url_pdf":"https://arxiv.org/pdf/2511.00637v1","authors":"[\"Emmeran Johnson\",\"Alberto Rumi\",\"Ciara Pike-Burke\",\"Patrick Rebeschini\"]","published":"2025-11-01T17:34:50Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}
