{"ID":2876142,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.25197","arxiv_id":"2509.25197","title":"Towards Repository-Level Program Verification with Large Language Models","abstract":"Recent advancements in large language models (LLMs) suggest great promises in code and proof generations. However, scaling automated formal verification to real-world projects requires resolving cross-module dependencies and global contexts, which are crucial challenges overlooked by existing LLM-based methods with a special focus on targeting isolated, function-level verification tasks. To systematically explore and address the significant challenges of verifying entire software repositories, we introduce RVBench, the first verification benchmark explicitly designed for repository-level evaluation, constructed from four diverse and complex open-source Verus projects. We further introduce RagVerus, an extensible framework that synergizes retrieval-augmented generation with context-aware prompting to automate proof synthesis for multi-module repositories. RagVerus triples proof pass rates on existing benchmarks under constrained model inference budgets, and achieves a 27% relative improvement on the more challenging RVBench benchmark, demonstrating a scalable and sample-efficient verification solution.","short_abstract":"Recent advancements in large language models (LLMs) suggest great promises in code and proof generations. However, scaling automated formal verification to real-world projects requires resolving cross-module dependencies and global contexts, which are crucial challenges overlooked by existing LLM-based methods with a s...","url_abs":"https://arxiv.org/abs/2509.25197","url_pdf":"https://arxiv.org/pdf/2509.25197v1","authors":"[\"Si Cheng Zhong\",\"Xujie Si\"]","published":"2025-08-31T02:44:04Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\",\"cs.PL\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}
