Staff View: CrystalICL: Enabling In-Context Learning for Crystal Generation :: Library Catalog

Bibliographic Details
Main Authors:	Wang, Ruobing; Tan, Qiaoyu; Wang, Yili; Wang, Ying; Wang, Xin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Materials Science
Online Access:	https://arxiv.org/abs/2508.20143

_version_	1866911127011590144
author	Wang, Ruobing; Tan, Qiaoyu; Wang, Yili; Wang, Ying; Wang, Xin
author_facet	Wang, Ruobing; Tan, Qiaoyu; Wang, Yili; Wang, Ying; Wang, Xin
contents	Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and cannot benefit from few-shot examples. In contrast, human experts typically design new materials by modifying relevant known structures, a practice that aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group-based crystal tokenization method, which reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure-aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over the leading baseline methods on conditional and unconditional generation tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_20143
institution	arXiv
publishDate	2025
record_format	arxiv
title	CrystalICL: Enabling In-Context Learning for Crystal Generation
topic	Machine Learning; Materials Science
url	https://arxiv.org/abs/2508.20143

Similar Items