{"ID":2860578,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.03798","arxiv_id":"2510.03798","title":"Robust Batched Bandits","abstract":"The batched multi-armed bandit (MAB) problem, in which rewards are collected in batches, is crucial for applications such as clinical trials. Existing research predominantly assumes light-tailed reward distributions, yet many real-world scenarios, including clinical outcomes, exhibit heavy-tailed characteristics. This paper bridges this gap by proposing robust batched bandit algorithms designed for heavy-tailed rewards, within both finite-arm and Lipschitz-continuous settings. We reveal a surprising phenomenon: in the instance-independent regime, as well as in the Lipschitz setting, heavier-tailed rewards necessitate a smaller number of batches to achieve near-optimal regret. In stark contrast, for the instance-dependent setting, the required number of batches to attain near-optimal regret remains invariant with respect to tail heaviness.","short_abstract":"The batched multi-armed bandit (MAB) problem, in which rewards are collected in batches, is crucial for applications such as clinical trials. Existing research predominantly assumes light-tailed reward distributions, yet many real-world scenarios, including clinical outcomes, exhibit heavy-tailed characteristics. This...","url_abs":"https://arxiv.org/abs/2510.03798","url_pdf":"https://arxiv.org/pdf/2510.03798v2","authors":"[\"Yunwen Guo\",\"Yunlun Shu\",\"Gongyi Zhuo\",\"Tianyu Wang\"]","published":"2025-10-04T12:26:32Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}