Bibliographic Details
Main Authors: Veljković, Tin Hadži; Rosenthal, Joshua; Lončarić, Ivor; van de Meent, Jan-Willem
Format: Preprint
Published: 2026
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2604.02270
author Veljković, Tin Hadži
Rosenthal, Joshua
Lončarić, Ivor
van de Meent, Jan-Willem
contents Generative models for crystalline materials often rely on equivariant graph neural networks, which capture geometric structure well but are costly to train and slow to sample. We present Crystalite, a lightweight diffusion Transformer for crystal modeling built around two simple inductive biases. The first is Subatomic Tokenization, a compact, chemically structured atom representation that replaces high-dimensional one-hot encodings and is better suited to continuous diffusion. The second is the Geometry Enhancement Module (GEM), which injects periodic minimum-image pair geometry directly into attention through additive geometric biases. Together, these components preserve the simplicity and efficiency of a standard Transformer while making it better matched to the structure of crystalline materials. Crystalite achieves state-of-the-art results on crystal structure prediction benchmarks and on de novo generation, attaining the best S.U.N. discovery score among the evaluated baselines while sampling substantially faster than geometry-heavy alternatives.
format Preprint
id arxiv_https___arxiv_org_abs_2604_02270
institution arXiv
publishDate 2026
record_format arxiv
title Crystalite: A Lightweight Transformer for Efficient Crystal Modeling
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2604.02270