{"ID":2891328,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.17448","arxiv_id":"2507.17448","title":"Reasoning-Driven Retrosynthesis Prediction with Large Language Models via Reinforcement Learning","abstract":"Retrosynthesis planning, essential in organic synthesis and drug discovery, has greatly benefited from recent AI-driven advancements. Nevertheless, existing methods frequently face limitations in both applicability and explainability. Traditional graph-based and sequence-to-sequence models often lack generalized chemical knowledge, leading to predictions that are neither consistently accurate nor easily explainable. To address these challenges, we introduce RetroDFM-R, a reasoning-based large language model (LLM) designed specifically for chemical retrosynthesis. Leveraging large-scale reinforcement learning guided by chemically verifiable rewards, RetroDFM-R significantly enhances prediction accuracy and explainability. Comprehensive evaluations demonstrate that RetroDFM-R significantly outperforms state-of-the-art methods, achieving a top-1 accuracy of 65.0% on the USPTO-50K benchmark. Double-blind human assessments further validate the chemical plausibility and practical utility of RetroDFM-R's predictions. RetroDFM-R also accurately predicts multistep retrosynthetic routes reported in the literature for both real-world drug molecules and perovskite materials. Crucially, the model's explicit reasoning process provides human-interpretable insights, thereby enhancing trust and practical value in real-world retrosynthesis applications.","short_abstract":"Retrosynthesis planning, essential in organic synthesis and drug discovery, has greatly benefited from recent AI-driven advancements. Nevertheless, existing methods frequently face limitations in both applicability and explainability. Traditional graph-based and sequence-to-sequence models often lack generalized chemic...","url_abs":"https://arxiv.org/abs/2507.17448","url_pdf":"https://arxiv.org/pdf/2507.17448v1","authors":"[\"Situo Zhang\",\"Hanqi Li\",\"Lu Chen\",\"Zihan Zhao\",\"Xuanze Lin\",\"Zichen Zhu\",\"Bo Chen\",\"Xin Chen\",\"Kai Yu\"]","published":"2025-07-23T12:13:06Z","proceeding":"cs.CE","tasks":"[\"cs.CE\",\"cs.AI\",\"physics.chem-ph\"]","methods":"[\"Reinforcement Learning\",\"Large Language Model\",\"Language Model\",\"Generative Adversarial Network\"]","has_code":false}
