{"ID":2885432,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05844","arxiv_id":"2508.05844","title":"Online Budget Allocation with Censored Semi-Bandit Feedback","abstract":"We study a stochastic budget-allocation problem over $K$ tasks. At each round $t$, the learner chooses an allocation $X_t \\in Δ_K$. Task $k$ succeeds with probability $F_k(X_{t,k})$, where $F_1,\\dots,F_K$ are nondecreasing budget-to-success curves, and upon success yields a random reward with unknown mean $μ_k$. The learner observes which tasks succeed, and observes a task's reward only upon success (censored semi-bandit feedback). This model captures, for instance, splitting payments across crowdsourcing workers or distributing bids across simultaneous auctions, and subsumes stochastic multi-armed bandits and semi-bandits. We design an optimism-based algorithm that operates under censored semi-bandit feedback. Our main result shows that in diminishing-returns regimes, the regret of this algorithm scales polylogarithmically with the horizon $T$ without any ad hoc tuning. For general nondecreasing curves, we prove that the same algorithm (with the same tuning) achieves a worst-case regret upper bound of $\\tilde O(K\\sqrt{T})$. Finally, we establish a matching worst-case regret lower bound of $Ω(K\\sqrt{T})$ that holds even for full-feedback algorithms, highlighting the intrinsic hardness of our problem outside diminishing returns.","short_abstract":"We study a stochastic budget-allocation problem over $K$ tasks. At each round $t$, the learner chooses an allocation $X_t \\in Δ_K$. Task $k$ succeeds with probability $F_k(X_{t,k})$, where $F_1,\\dots,F_K$ are nondecreasing budget-to-success curves, and upon success yields a random reward with unknown mean $μ_k$. The le...","url_abs":"https://arxiv.org/abs/2508.05844","url_pdf":"https://arxiv.org/pdf/2508.05844v2","authors":"[\"François Bachoc\",\"Nicolò Cesa-Bianchi\",\"Tommaso Cesari\",\"Roberto Colomboni\"]","published":"2025-08-07T20:47:27Z","proceeding":"cs.GT","tasks":"[\"cs.GT\",\"cs.LG\",\"stat.ML\"]","methods":"[]","has_code":false}
