{"ID":2863985,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.25509","arxiv_id":"2509.25509","title":"Can Molecular Foundation Models Know What They Don't Know? A Simple Remedy with Preference Optimization","abstract":"Molecular foundation models are rapidly advancing scientific discovery, but their unreliability on out-of-distribution (OOD) samples severely limits their application in high-stakes domains such as drug discovery and protein design. A critical failure mode is chemical hallucination, where models make high-confidence yet entirely incorrect predictions for unknown molecules. To address this challenge, we introduce Molecular Preference-Aligned Instance Ranking (Mole-PAIR), a simple, plug-and-play module that can be flexibly integrated with existing foundation models to improve their reliability on OOD data through cost-effective post-training. Specifically, our method formulates the OOD detection problem as a preference optimization over the estimated OOD affinity between in-distribution (ID) and OOD samples, achieving this goal through a pairwise learning objective. We show that this objective essentially optimizes AUROC, which measures how consistently ID and OOD samples are ranked by the model. Extensive experiments across five real-world molecular datasets demonstrate that our approach significantly improves the OOD detection capabilities of existing molecular foundation models, achieving up to 45.8%, 43.9%, and 24.3% improvements in AUROC under distribution shifts of size, scaffold, and assay, respectively.","short_abstract":"Molecular foundation models are rapidly advancing scientific discovery, but their unreliability on out-of-distribution (OOD) samples severely limits their application in high-stakes domains such as drug discovery and protein design. A critical failure mode is chemical hallucination, where models make high-confidence ye...","url_abs":"https://arxiv.org/abs/2509.25509","url_pdf":"https://arxiv.org/pdf/2509.25509v1","authors":"[\"Langzhou He\",\"Junyou Zhu\",\"Fangxin Wang\",\"Junhua Liu\",\"Haoyan Xu\",\"Yue Zhao\",\"Philip S. Yu\",\"Qitian Wu\"]","published":"2025-09-29T21:06:52Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"q-bio.QM\"]","methods":"[]","has_code":false}
