{"ID":2833287,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03420","arxiv_id":"2512.03420","title":"HarnessAgent: Scaling Automatic Fuzzing Harness Construction with Tool-Augmented LLM Pipelines","abstract":"Large language model (LLM)-based techniques have achieved notable progress in generating harnesses for program fuzzing. However, applying them to arbitrary functions (especially internal functions) \\textit{at scale} remains challenging due to the requirement of sophisticated contextual information, such as specification, dependencies, and usage examples. State-of-the-art methods heavily rely on static or incomplete context provisioning, causing failure of generating functional harnesses. Furthermore, LLMs tend to exploit harness validation metrics, producing plausible yet logically useless code. % Therefore, harness generation across large and diverse projects continues to face challenges in reliable compilation, robust code retrieval, and comprehensive validation. To address these challenges, we present HarnessAgent, a tool-augmented agentic framework that achieves fully automated, scalable harness construction over hundreds of OSS-Fuzz targets. HarnessAgent introduces three key innovations: 1) a rule-based strategy to identify and minimize various compilation errors; 2) a hybrid tool pool for precise and robust symbol source code retrieval; and 3) an enhanced harness validation pipeline that detects fake definitions. We evaluate HarnessAgent on 243 target functions from OSS-Fuzz projects (65 C projects and 178 C++ projects). It improves the three-shot success rate by approximately 20\\% compared to state-of-the-art techniques, reaching 87\\% for C and 81\\% for C++. Our one-hour fuzzing results show that more than 75\\% of the harnesses generated by HarnessAgent increase the target function coverage, surpassing the baselines by over 10\\%. In addition, the hybrid tool-pool system of HarnessAgent achieves a response rate of over 90\\% for source code retrieval, outperforming Fuzz Introspector by more than 30\\%.","short_abstract":"Large language model (LLM)-based techniques have achieved notable progress in generating harnesses for program fuzzing. However, applying them to arbitrary functions (especially internal functions) \\textit{at scale} remains challenging due to the requirement of sophisticated contextual information, such as specificatio...","url_abs":"https://arxiv.org/abs/2512.03420","url_pdf":"https://arxiv.org/pdf/2512.03420v3","authors":"[\"Kang Yang\",\"Yunhang Zhang\",\"Zichuan Li\",\"Guanhong Tao\",\"Jun Xu\",\"Xiaojing Liao\"]","published":"2025-12-03T03:55:09Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.SE\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
