{"ID":2898712,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.02541","arxiv_id":"2507.02541","title":"Clarifying Before Reasoning: A Coq Prover with Structural Context","abstract":"In this work, we investigate whether improving task clarity can enhance reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs, leads to a 1.85$\\times$ improvement in clarity score (44.5\\%~$\\rightarrow$~82.3\\%). Using the general-purpose model \\texttt{DeepSeek-V3}, our approach leads to a 2.1$\\times$ improvement in proof success (21.8\\%~$\\rightarrow$~45.8\\%) and outperforms the previous state-of-the-art \\texttt{Graph2Tac} (33.2\\%). We evaluate this on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as \\texttt{Graph2Tac}. Furthermore, fine-tuning smaller models on our structured data can achieve even higher performance (48.6\\%). Our method uses selective concept unfolding to enrich task descriptions, and employs a Planner--Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.","short_abstract":"In this work, we investigate whether improving task clarity can enhance reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs, leads to a 1.85...","url_abs":"https://arxiv.org/abs/2507.02541","url_pdf":"https://arxiv.org/pdf/2507.02541v1","authors":"[\"Yanzhen Lu\",\"Hanbin Yang\",\"Xiaodie Wang\",\"Ge Zhang\",\"Biao Li\",\"Chenxu Fu\",\"Chao Li\",\"Yang Yuan\",\"Andrew Chi-Chih Yao\"]","published":"2025-07-03T11:35:34Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
