{"ID":2826849,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.18531","arxiv_id":"2512.18531","title":"Pushing the limits of one-dimensional NMR spectroscopy for automated structure elucidation using artificial intelligence","abstract":"One-dimensional NMR spectroscopy is one of the most widely used techniques for the characterization of organic compounds and natural products. For molecules with up to 36 non-hydrogen atoms, the number of possible structures has been estimated to range from $10^{20} - 10^{60}$. The task of determining the structure (formula and connectivity) of a molecule of this size using only its one-dimensional $^1$H and/or $^{13}$C NMR spectrum, i.e. de novo structure generation, thus appears completely intractable. Here we show how it is possible to achieve this task for systems with up to 40 non-hydrogen atoms across the full elemental coverage typically encountered in organic chemistry (C, N, O, H, P, S, Si, B, and the halogens) using a deep learning framework, thus covering a vast portion of the drug-like chemical space. Leveraging insights from natural language processing, we show that our transformer-based architecture predicts the correct molecule with 55.2% accuracy within the first 15 predictions using only the $^1$H and $^{13}$C NMR spectra, thus overcoming the combinatorial growth of the chemical space while also being extensible to experimental data via fine-tuning.","short_abstract":"One-dimensional NMR spectroscopy is one of the most widely used techniques for the characterization of organic compounds and natural products. For molecules with up to 36 non-hydrogen atoms, the number of possible structures has been estimated to range from $10^{20} - 10^{60}$. The task of determining the structure (fo...","url_abs":"https://arxiv.org/abs/2512.18531","url_pdf":"https://arxiv.org/pdf/2512.18531v1","authors":"[\"Frank Hu\",\"Jonathan M. Tubb\",\"Dimitris Argyropoulos\",\"Sergey Golotvin\",\"Mikhail Elyashberg\",\"Grant M. Rotskoff\",\"Matthew W. Kanan\",\"Thomas E. Markland\"]","published":"2025-12-20T22:56:49Z","proceeding":"physics.chem-ph","tasks":"[\"physics.chem-ph\",\"cs.LG\"]","methods":"[\"Transformer\",\"Generative Adversarial Network\"]","has_code":false}
