{"ID":2864506,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.23019","arxiv_id":"2509.23019","title":"LLM Watermark Evasion via Bias Inversion","abstract":"Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion remains an open challenge. Existing query-free attacks often achieve limited success or severely distort semantic meaning. We bridge this gap by theoretically analyzing rewriting-based evasion, demonstrating that reducing the average conditional probability of sampling green tokens by a small margin causes the detection probability to decay exponentially. Guided by this insight, we propose the \\emph{Bias-Inversion Rewriting Attack} (BIRA), a practical query-free method that applies a negative logit bias to a proxy suppression set identified via token surprisal. Empirically, BIRA achieves state-of-the-art evasion rates ($\u003e99\\%$) across diverse watermarking schemes while preserving semantic fidelity substantially better than prior baselines. Our findings reveal a fundamental vulnerability in current watermarking methods and highlight the need for rigorous stress tests. Our code is available at \\href{https://github.com/ml-postech/LLM-Watermark-Evasion-via-Bias-Inversion}{here}.","short_abstract":"Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion remains an open challenge. Existing query-free attacks often achieve limited success or severely distort semantic meaning. We bridge this gap by theoretically analyzing rewriti...","url_abs":"https://arxiv.org/abs/2509.23019","url_pdf":"https://arxiv.org/pdf/2509.23019v5","authors":"[\"Jeongyeon Hwang\",\"Sangdon Park\",\"Jungseul Ok\"]","published":"2025-09-27T00:24:57Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.AI\"]","methods":"[\"Large Language Model\"]","has_code":false,"code_links":[{"ID":609159,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2864506,"paper_url":"https://arxiv.org/abs/2509.23019","paper_title":"LLM Watermark Evasion via Bias Inversion","repo_url":"https://github.com/ml-postech/LLM-Watermark-Evasion-via-Bias-Inversion","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
