{"ID":2893735,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.11847","arxiv_id":"2507.11847","title":"Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update","abstract":"We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function, thereby modeling a broad class of reward distributions such as Bernoulli and Poisson. While GLBs are widely applicable to real-world scenarios, their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. Existing methods typically trade off between two objectives, either incurring high per-round costs for optimal regret guarantees or compromising statistical efficiency to enable constant-time updates. In this paper, we propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\\mathcal{O}(1)$ time and space complexities per round. The core of our method is a tight confidence set for the online mirror descent (OMD) estimator, which is derived through a novel analysis that leverages the notion of mix loss from online prediction. The analysis shows that our OMD estimator, even with its one-pass updates, achieves statistical efficiency comparable to maximum likelihood estimation, thereby leading to a jointly efficient optimistic method.","short_abstract":"We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function, thereby modeling a broad class of reward distributions such as Bernoulli and Poisson. While GLBs are widely applicable to real-world scenar...","url_abs":"https://arxiv.org/abs/2507.11847","url_pdf":"https://arxiv.org/pdf/2507.11847v2","authors":"[\"Yu-Jie Zhang\",\"Sheng-An Xu\",\"Peng Zhao\",\"Masashi Sugiyama\"]","published":"2025-07-16T02:24:21Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}
