{"ID":2850685,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.23637","arxiv_id":"2510.23637","title":"Combining Textual and Structural Information for Premise Selection in Lean","abstract":"Premise selection is a key bottleneck for scaling theorem proving in large formal libraries. Yet existing language-based methods often treat premises in isolation, ignoring the web of dependencies that connects them. We present a graph-augmented approach that combines dense text embeddings of Lean formalizations with graph neural networks over a heterogeneous dependency graph capturing both state-premise and premise-premise relations. On the LeanDojo Benchmark, our method outperforms the ReProver language-based baseline by over 25\\% across standard retrieval metrics. These results suggest that relational information is beneficial for premise selection.","short_abstract":"Premise selection is a key bottleneck for scaling theorem proving in large formal libraries. Yet existing language-based methods often treat premises in isolation, ignoring the web of dependencies that connects them. We present a graph-augmented approach that combines dense text embeddings of Lean formalizations with g...","url_abs":"https://arxiv.org/abs/2510.23637","url_pdf":"https://arxiv.org/pdf/2510.23637v2","authors":"[\"Job Petrovčič\",\"David Eliecer Narvaez Denis\",\"Ljupčo Todorovski\"]","published":"2025-10-24T14:24:13Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.CL\",\"cs.LO\"]","methods":"[\"Graph Neural Network\"]","has_code":false}
