{"ID":3084662,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-06T20:04:15.878508328Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.05400","arxiv_id":"2606.05400","title":"LeanMarathon: Toward Reliable AI Co-Mathematicians through Long-Horizon Lean Autoformalization","abstract":"Long-horizon autoformalization of research mathematics fails not only at hard lemmas, but at scale: statements drift, dependencies tangle, context decays, and local repairs corrupt distant work. We present LeanMarathon, a multi-agent harness for reliable research-level Lean autoformalization. Its core abstraction is an evolving blueprint: a Lean file that serves simultaneously as formal proof skeleton, natural-language proof graph, and shared system of record. Four contract-scoped agents construct, audit, prove, and repair this blueprint. These agents are coordinated by a two-stage orchestrator that first stabilizes target fidelity through adversarial review and then discharges the proof directed acyclic graph (DAG) from its dynamic leaves upward in parallel CI-gated rounds. LeanMarathon turns one brittle multi-hour run into many local, recoverable, parallel transactions. We evaluate LeanMarathon on two recent research papers spanning four Erdős problems (#1051, #1196, #164, #1217). Across three autonomous runs, it formalizes all seven target theorems with no sorry, proving 258 lemmas and theorems. These results show that reliable AI co-mathematics requires not only stronger provers, but durable harnesses that preserve target fidelity across long mathematical developments. The code can be found at https://github.com/YuanheZ/LeanMarathon.","short_abstract":"Long-horizon autoformalization of research mathematics fails not only at hard lemmas, but at scale: statements drift, dependencies tangle, context decays, and local repairs corrupt distant work. We present LeanMarathon, a multi-agent harness for reliable research-level Lean autoformalization. Its core abstraction is an...","url_abs":"https://arxiv.org/abs/2606.05400","url_pdf":"https://arxiv.org/pdf/2606.05400v1","authors":"[\"Yuanhe Zhang\",\"Yuekai Sun\",\"Taiji Suzuki\",\"Jason D. Lee\",\"Fanghui Liu\"]","published":"2026-06-03T20:09:39Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.CL\",\"cs.LG\"]","methods":"[]","has_code":false,"code_links":[{"ID":612850,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-05T06:46:15.197025399Z","DeletedAt":null,"paper_id":3084662,"paper_url":"https://arxiv.org/abs/2606.05400","paper_title":"LeanMarathon: Toward Reliable AI Co-Mathematicians through Long-Horizon Lean Autoformalization","repo_url":"https://github.com/YuanheZ/LeanMarathon","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
