{"ID":2880779,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.13981","arxiv_id":"2508.13981","title":"Multi-User Contextual Cascading Bandits for Personalized Recommendation","abstract":"We introduce a Multi-User Contextual Cascading Bandit model, a new combinatorial bandit framework that captures realistic online advertising scenarios where multiple users interact with sequentially displayed items simultaneously. Unlike classical contextual bandits, MCCB integrates three key structural elements: (i) cascading feedback based on sequential arm exposure, (ii) parallel context sessions enabling selective exploration, and (iii) heterogeneous arm-level rewards. We first propose Upper Confidence Bound with Backward Planning (UCBBP), a UCB-style algorithm tailored to this setting, and prove that it achieves a regret bound of $\\widetilde{O}(\\sqrt{THN})$ over $T$ episodes, $H$ session steps, and $N$ contexts per episode. Motivated by the fact that many users interact with the system simultaneously, we introduce a second algorithm, termed Active Upper Confidence Bound with Backward Planning (AUCBBP), which shows a strict efficiency improvement in context scaling, i.e., user scaling, with a regret bound of $\\widetilde{O}(\\sqrt{T+HN})$. We validate our theoretical findings via numerical experiments, demonstrating the empirical effectiveness of both algorithms under various settings.","short_abstract":"We introduce a Multi-User Contextual Cascading Bandit model, a new combinatorial bandit framework that captures realistic online advertising scenarios where multiple users interact with sequentially displayed items simultaneously. Unlike classical contextual bandits, MCCB integrates three key structural elements: (i) c...","url_abs":"https://arxiv.org/abs/2508.13981","url_pdf":"https://arxiv.org/pdf/2508.13981v2","authors":"[\"Jiho Park\",\"Huiwen Jia\"]","published":"2025-08-19T16:14:33Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.OC\",\"stat.ML\"]","methods":"[\"LoRA\"]","has_code":false}
