{"ID":2856859,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10730","arxiv_id":"2510.10730","title":"Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits","abstract":"We provide a unified algorithmic framework for ensemble sampling in nonlinear contextual bandits and develop corresponding regret bounds for two most common nonlinear contextual bandit settings: Generalized Linear Ensemble Sampling (GLM-ES) for generalized linear bandits and Neural Ensemble Sampling (Neural-ES) for neural contextual bandits. Both methods maintain multiple estimators for the reward model parameters via maximum likelihood estimation on randomly perturbed data. We prove high-probability frequentist regret bounds of $\\widetilde{O}(d^{3/2} \\sqrt{T} + d^{4})$ for GLM-ES and $\\widetilde{O}(\\widetilde{d}^{3/2} \\sqrt{T})$ for Neural-ES, where $d$ is the dimension of feature vectors, $\\widetilde{d}$ is the effective dimension of a neural tangent kernel (NTK) matrix and $T$ is the number of rounds. The regret bound of GLM-ES matches the state-of-the-art result of randomized exploration algorithms in generalized linear bandit setting. In the theoretical analysis, we introduce techniques that address challenges specific to nonlinear models. Practically, we remove fixed-time horizon assumption by developing anytime versions of our algorithms, suitable when $T$ is unknown. Finally, we empirically evaluate GLM-ES, Neural-ES and their anytime variants, demonstrating strong performance. Overall, our results establish ensemble sampling as a provable and practical randomized exploration approach for nonlinear contextual bandits.","short_abstract":"We provide a unified algorithmic framework for ensemble sampling in nonlinear contextual bandits and develop corresponding regret bounds for two most common nonlinear contextual bandit settings: Generalized Linear Ensemble Sampling (GLM-ES) for generalized linear bandits and Neural Ensemble Sampling (Neural-ES) for neu...","url_abs":"https://arxiv.org/abs/2510.10730","url_pdf":"https://arxiv.org/pdf/2510.10730v2","authors":"[\"Jiazheng Sun\",\"Weixin Wang\",\"Pan Xu\"]","published":"2025-10-12T18:05:53Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"stat.ML\"]","methods":"[\"LoRA\"]","has_code":false}
