{"ID":2859289,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05808","arxiv_id":"2510.05808","title":"Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making","abstract":"Minimax risk and regret focus on expectation, missing rare failures critical in safety-critical bandits and reinforcement learning. Minimax quantiles capture these tails. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that focus on expected risk rather than risk level specific quantile bounds; and high-probability bandit bounds that still lack a quantile-specific toolkit for general interactive protocols. To close this gap, within the interactive statistical decision making framework, we develop high-probability Fano and Le Cam tools and derive risk level explicit minimax-quantile bounds, including a quantile-to-expectation conversion and a tight link between strict and lower minimax quantiles. Instantiating these results for the two-armed Gaussian bandit immediately recovers optimal-rate bounds.","short_abstract":"Minimax risk and regret focus on expectation, missing rare failures critical in safety-critical bandits and reinforcement learning. Minimax quantiles capture these tails. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that...","url_abs":"https://arxiv.org/abs/2510.05808","url_pdf":"https://arxiv.org/pdf/2510.05808v1","authors":"[\"Raghav Bongole\",\"Amirreza Zamani\",\"Tobias J. Oechtering\",\"Mikael Skoglund\"]","published":"2025-10-07T11:25:13Z","proceeding":"cs.IT","tasks":"[\"cs.IT\",\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
