| Main Authors: | Veljković, Tin Hadži; Rosenthal, Joshua; Lončarić, Ivor; van de Meent, Jan-Willem |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2604.02270 |
| _version_ | 1866911563566284800 |
|---|---|
| author | Veljković, Tin Hadži; Rosenthal, Joshua; Lončarić, Ivor; van de Meent, Jan-Willem |
| author_facet | Veljković, Tin Hadži; Rosenthal, Joshua; Lončarić, Ivor; van de Meent, Jan-Willem |
| contents | Generative models for crystalline materials often rely on equivariant graph neural networks, which capture geometric structure well but are costly to train and slow to sample. We present Crystalite, a lightweight diffusion Transformer for crystal modeling built around two simple inductive biases. The first is Subatomic Tokenization, a compact, chemically structured atom representation that replaces high-dimensional one-hot encodings and is better suited to continuous diffusion. The second is the Geometry Enhancement Module (GEM), which injects periodic minimum-image pair geometry directly into attention through additive geometric biases. Together, these components preserve the simplicity and efficiency of a standard Transformer while making it better matched to the structure of crystalline materials. Crystalite achieves state-of-the-art results on crystal structure prediction benchmarks and in de novo generation, attaining the best S.U.N. discovery score among the evaluated baselines while sampling substantially faster than geometry-heavy alternatives. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_02270 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Crystalite: A Lightweight Transformer for Efficient Crystal Modeling |
| topic | Machine Learning; Artificial Intelligence |
| url | https://arxiv.org/abs/2604.02270 |
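
The abstract describes GEM as adding periodic minimum-image pair geometry to attention as an additive bias. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' module. The function names, the `length_scale` parameter, and the linear distance penalty are all illustrative assumptions. The fractional-coordinate wrapping used here is exact for orthogonal cells and only approximate for strongly skewed ones.

```python
import math


def minimum_image_distances(frac_coords, lattice):
    """Pairwise distances between atoms under periodic boundary conditions.

    frac_coords: list of [u, v, w] fractional coordinates, one per atom.
    lattice: 3x3 nested list whose rows are the Cartesian cell vectors a, b, c.
    """
    n = len(frac_coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Fractional displacement, wrapped into [-0.5, 0.5] per axis so
            # that the nearest periodic image of atom j is used.
            d_frac = [frac_coords[j][k] - frac_coords[i][k] for k in range(3)]
            d_frac = [x - round(x) for x in d_frac]
            # Convert the wrapped displacement to Cartesian coordinates.
            d_cart = [
                sum(d_frac[k] * lattice[k][axis] for k in range(3))
                for axis in range(3)
            ]
            dist[i][j] = math.sqrt(sum(x * x for x in d_cart))
    return dist


def biased_attention_weights(queries, keys, dist, length_scale=2.0):
    """Softmax attention weights with an additive geometric bias.

    The bias -dist / length_scale is a hypothetical choice for illustration:
    it lowers the logit of distant pairs before the softmax.
    """
    d_model = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        logits = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d_model)
            - dist[i][j] / length_scale
            for j, k in enumerate(keys)
        ]
        # Subtract the row maximum for numerical stability.
        top = max(logits)
        exps = [math.exp(value - top) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Example: three atoms in a 4 Å cubic cell. Atoms 0 and 1 sit on opposite
# sides of the cell boundary, so they are close through the periodic image.
lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
frac_coords = [[0.05, 0.0, 0.0], [0.95, 0.0, 0.0], [0.5, 0.0, 0.0]]
dist = minimum_image_distances(frac_coords, lattice)

# With identical (zero) queries and keys, the weights depend on geometry only.
zeros = [[0.0, 0.0] for _ in frac_coords]
weights = biased_attention_weights(zeros, zeros, dist)
```

In this example atoms 0 and 1 end up about 0.4 Å apart through the cell boundary rather than 3.6 Å apart inside the cell, so atom 0 attends more strongly to atom 1 than to atom 2. Because the bias is added to the logits and the rest is ordinary scaled dot-product attention, the Transformer block itself stays unchanged.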