{"ID":3084783,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-07T02:02:03.244594148Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05609","arxiv_id":"2606.05609","title":"SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks","abstract":"As large language models (LLMs) are widely deployed, identifying their vulnerability through jailbreak attacks becomes increasingly critical. Optimization-based attacks like Greedy Coordinate Gradient (GCG) have focused on inserting adversarial tokens to the end of prompts. However, GCG restricts adversarial tokens to a fixed insertion point (typically the prompt suffix), leaving the effect of inserting tokens at other positions unexplored. In this paper, we empirically investigate \\emph{slots}, i.e., candidate positions within a prompt where tokens can be inserted. We find that vulnerability to jailbreaking is highly related to the selection of the \\emph{slots}. Based on these findings, we introduce the \\textit{Vulnerable Slot Score} (VSS) to quantify the positional vulnerability to jailbreaking. We then propose SlotGCG, which evaluates all slots with VSS, selects the most vulnerable slots for insertion, and runs a targeted optimization attack at those slots. Our approach provides a position-search mechanism that is attack-agnostic and can be plugged into any optimization-based attack, adding only 200ms of preprocessing time. Experiments across multiple models demonstrate that SlotGCG significantly outperforms existing methods. Specifically, it achieves 14\\% higher Attack Success Rates (ASR) over GCG-based attacks, converges faster, and shows superior robustness against defense methods with 42\\% higher ASR than baseline approaches. Our implementation is available at \\href{https://github.com/youai058/SlotGCG}{https://github.com/youai058/SlotGCG}","short_abstract":"As large language models (LLMs) are widely deployed, identifying their vulnerability through jailbreak attacks becomes increasingly critical. Optimization-based attacks like Greedy Coordinate Gradient (GCG) have focused on inserting adversarial tokens to the end of prompts. However, GCG restricts adversarial tokens to...","url_abs":"https://arxiv.org/abs/2606.05609","url_pdf":"https://arxiv.org/pdf/2606.05609v1","authors":"[\"Seungwon Jeong\",\"Jiwoo Jeong\",\"Hyeonjin Kim\",\"Yunseok Lee\",\"Woojin Lee\"]","published":"2026-06-04T02:31:29Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":612857,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-05T06:46:15.197025399Z","DeletedAt":null,"paper_id":3084783,"paper_url":"https://arxiv.org/abs/2606.05609","paper_title":"SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks","repo_url":"https://github.com/youai058/SlotGCG","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
