{"ID":2831342,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.08896","arxiv_id":"2512.08896","title":"Open Polymer Challenge: Post-Competition Report","abstract":"Machine learning (ML) offers a powerful path toward discovering sustainable polymer materials, but progress has been limited by the lack of large, high-quality, and openly accessible polymer datasets. The Open Polymer Challenge (OPC) addresses this gap by releasing the first community-developed benchmark for polymer informatics, featuring a dataset with 10K polymers and 5 properties: thermal conductivity, radius of gyration, density, fractional free volume, and glass transition temperature. The challenge centers on multi-task polymer property prediction, a core step in virtual screening pipelines for materials discovery. Participants developed models under realistic constraints that include small data, label imbalance, and heterogeneous simulation sources, using techniques such as feature-based augmentation, transfer learning, self-supervised pretraining, and targeted ensemble strategies. The competition also revealed important lessons about data preparation, distribution shifts, and cross-group simulation consistency, informing best practices for future large-scale polymer datasets. The resulting models, analysis, and released data create a new foundation for molecular AI in polymer science and are expected to accelerate the development of sustainable and energy-efficient materials. Along with the competition, we release the test dataset at https://www.kaggle.com/datasets/alexliu99/neurips-open-polymer-prediction-2025-test-data. We also release the data generation pipeline at https://github.com/sobinalosious/ADEPT, which simulates more than 25 properties, including thermal conductivity, radius of gyration, and density.","short_abstract":"Machine learning (ML) offers a powerful path toward discovering sustainable polymer materials, but progress has been limited by the lack of large, high-quality, and openly accessible polymer datasets. The Open Polymer Challenge (OPC) addresses this gap by releasing the first community-developed benchmark for polymer in...","url_abs":"https://arxiv.org/abs/2512.08896","url_pdf":"https://arxiv.org/pdf/2512.08896v1","authors":"[\"Gang Liu\",\"Sobin Alosious\",\"Subhamoy Mahajan\",\"Eric Inae\",\"Yihan Zhu\",\"Yuhan Liu\",\"Renzheng Zhang\",\"Jiaxin Xu\",\"Addison Howard\",\"Ying Li\",\"Tengfei Luo\",\"Meng Jiang\"]","published":"2025-12-09T18:38:15Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","project_urls":"[\"https://www.kaggle.com/datasets/alexliu99/neurips-open-polymer-prediction-2025-test-data\"]","has_code":false,"code_links":[{"ID":606116,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2831342,"paper_url":"https://arxiv.org/abs/2512.08896","paper_title":"Open Polymer Challenge: Post-Competition Report","repo_url":"https://github.com/sobinalosious/ADEPT","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
