{"ID":2860932,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.02962","arxiv_id":"2510.02962","title":"Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking","abstract":"Large Language Models (LLMs) are increasingly fine-tuned on smaller, domain-specific datasets to improve downstream performance. These datasets often contain proprietary or copyrighted material, raising the need for reliable safeguards against unauthorized use. Existing membership inference attacks (MIAs) and dataset-inference methods typically require access to internal signals such as logits, while current black-box approaches often rely on handcrafted prompts or a clean reference dataset for calibration, both of which limit practical applicability. Watermarking is a promising alternative, but prior techniques can degrade text quality or reduce task performance. We propose TRACE, a practical framework for fully black-box detection of copyrighted dataset usage in LLM fine-tuning. TRACE rewrites datasets with distortion-free watermarks guided by a private key, ensuring both text quality and downstream utility. At detection time, we exploit the radioactivity effect of fine-tuning on watermarked data and introduce an entropy-gated procedure that selectively scores high-uncertainty tokens, substantially amplifying detection power. Across diverse datasets and model families, TRACE consistently achieves significant detections (p\u003c0.05), often with extremely strong statistical evidence. Furthermore, it supports multi-dataset attribution and remains robust even after continued pretraining on large non-watermarked corpora. These results establish TRACE as a practical route to reliable black-box verification of copyrighted dataset usage. 
We will make our code available at: https://github.com/NusIoraPrivacy/TRACE.","short_abstract":"Large Language Models (LLMs) are increasingly fine-tuned on smaller, domain-specific datasets to improve downstream performance. These datasets often contain proprietary or copyrighted material, raising the need for reliable safeguards against unauthorized use. Existing membership inference attacks (MIAs) and dataset-i...","url_abs":"https://arxiv.org/abs/2510.02962","url_pdf":"https://arxiv.org/pdf/2510.02962v1","authors":"[\"Jingqi Zhang\",\"Ruibo Chen\",\"Yingqing Yang\",\"Peihua Mai\",\"Heng Huang\",\"Yan Pang\"]","published":"2025-10-03T12:53:02Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":608774,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2860932,"paper_url":"https://arxiv.org/abs/2510.02962","paper_title":"Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking","repo_url":"https://github.com/NusIoraPrivacy/TRACE","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
