{"ID":2877517,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.20143","arxiv_id":"2508.20143","title":"CrystalICL: Enabling In-Context Learning for Crystal Generation","abstract":"Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and are unable to benefit from few-shot scenarios. In contrast, human experts typically design new materials by modifying relevant known structures which aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group based crystal tokenization method, which effectively reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over the leading baseline methods on conditional and unconditional generation tasks.","short_abstract":"Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. \nWhile large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and are unable...","url_abs":"https://arxiv.org/abs/2508.20143","url_pdf":"https://arxiv.org/pdf/2508.20143v1","authors":"[\"Ruobing Wang\",\"Qiaoyu Tan\",\"Yili Wang\",\"Ying Wang\",\"Xin Wang\"]","published":"2025-08-27T07:49:27Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cond-mat.mtrl-sci\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
