{"ID":2890336,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.18975","arxiv_id":"2507.18975","title":"Secure Best Arm Identification in the Presence of a Copycat","abstract":"Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with $K$ arms of dimension $d$. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The player's goal is to identify the best arm after $T$ arm pulls. Moreover, assume a copycat Chloe is observing the arm pulls. The player wishes to keep Chloe ignorant of the best arm. While a minimax--optimal algorithm identifies the best arm with an $Ω\\left(\\frac{T}{\\log(d)}\\right)$ error exponent, it easily reveals its best-arm estimate to an outside observer, as the best arms are played more frequently. A naive secure algorithm that plays all arms equally results in an $Ω\\left(\\frac{T}{d}\\right)$ exponent. In this paper, we propose a secure algorithm that plays with \\emph{coded arms}. The algorithm does not require any key or cryptographic primitives, yet achieves an $Ω\\left(\\frac{T}{\\log^2(d)}\\right)$ exponent while revealing almost no information on the best arm.","short_abstract":"Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with $K$ arms of dimension $d$. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The p...","url_abs":"https://arxiv.org/abs/2507.18975","url_pdf":"https://arxiv.org/pdf/2507.18975v2","authors":"[\"Asaf Cohen\",\"Onur Günlü\"]","published":"2025-07-25T06:00:44Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}
