{"ID":2864551,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.02342","arxiv_id":"2510.02342","title":"CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models","abstract":"Watermarking algorithms for Large Language Models (LLMs) effectively identify machine-generated content by embedding and detecting hidden statistical features in text. However, such embedding leads to a decline in text quality, especially in low-entropy scenarios where performance needs improvement. Existing methods that rely on entropy thresholds often require significant computational resources for tuning and demonstrate poor adaptability to unknown or cross-task generation scenarios. We propose Context-Aware Threshold watermarking (CATMark), a novel framework that dynamically adjusts watermarking intensity based on real-time semantic context. CATMark partitions text generation into semantic states using logits clustering, establishing context-aware entropy thresholds that preserve fidelity in structured content while embedding robust watermarks. Crucially, it requires no pre-defined thresholds or task-specific tuning. Experiments show CATMark improves text quality in cross-tasks without sacrificing detection accuracy.","short_abstract":"Watermarking algorithms for Large Language Models (LLMs) effectively identify machine-generated content by embedding and detecting hidden statistical features in text. However, such embedding leads to a decline in text quality, especially in low-entropy scenarios where performance needs improvement. Existing methods th...","url_abs":"https://arxiv.org/abs/2510.02342","url_pdf":"https://arxiv.org/pdf/2510.02342v1","authors":"[\"Yu Zhang\",\"Shuliang Liu\",\"Xu Yang\",\"Xuming Hu\"]","published":"2025-09-27T03:43:52Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.AI\",\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
