{"ID":2921558,"CreatedAt":"2026-06-02T02:42:49.606572591Z","UpdatedAt":"2026-06-03T03:09:48.883664427Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.00984","arxiv_id":"2606.00984","title":"Practical and Optimal Algorithm for Linear Contextual Bandits with Rare Parameter Updates","abstract":"We study linear contextual bandits under rare parameter updates: the learner may incorporate reward feedback into its parameter estimate only at a small number of update times, while still observing contexts online and selecting actions sequentially. This viewpoint clarifies a practical distinction that is often blurred in the literature: many \"strictly batched\" methods additionally restrict within-interval context adaptivity, meaning that the action rule inside an interval cannot depend on the sequence of realized contexts/actions in that interval (beyond the current round's context). For linear contextual bandits, we propose two practical algorithms with only $O(\\log\\log T)$ parameter updates. Our first algorithm BLCE-G attains minimax-optimal regret (up to polylogarithmic factors in $T$) simultaneously in both the small-$K$ and large-$K$ regimes under a static schedule. Our second algorithm BLCE removes the near G-optimal design step -- a dominant computational bottleneck in prior strictly batched static-grid methods -- yet preserves minimax-optimal regret and achieves the lowest known runtime complexity among optimal algorithms. We further extend these rare-update and computational principles to generalized linear contextual bandits. Overall, our results yield statistically optimal algorithms under $O(\\log\\log T)$ parameter updates that are also computationally efficient in practice.","short_abstract":"We study linear contextual bandits under rare parameter updates: the learner may incorporate reward feedback into its parameter estimate only at a small number of update times, while still observing contexts online and selecting actions sequentially. This viewpoint clarifies a practical distinction that is often blurre...","url_abs":"https://arxiv.org/abs/2606.00984","url_pdf":"https://arxiv.org/pdf/2606.00984v1","authors":"[\"Sanghoon Yu\",\"Min-hwan Oh\"]","published":"2026-05-31T03:46:16Z","proceeding":"stat.ML","tasks":"[\"stat.ML\",\"cs.LG\"]","methods":"[]","has_code":false}
