Staff View: Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling :: Library Catalog

Bibliographic Details
Main Authors:	Li, Chaojian; Ye, Zhifan; Pasini, Massimiliano Lupo; Choi, Jong Youl; Wan, Cheng; Lin, Yingyan Celine; Balaprakash, Prasanna
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Materials Science
Online Access:	https://arxiv.org/abs/2504.08112

_version_	1866916685398671360
author	Li, Chaojian; Ye, Zhifan; Pasini, Massimiliano Lupo; Choi, Jong Youl; Wan, Cheng; Lin, Yingyan Celine; Balaprakash, Prasanna
author_facet	Li, Chaojian; Ye, Zhifan; Pasini, Massimiliano Lupo; Choi, Jong Youl; Wan, Cheng; Lin, Yingyan Celine; Balaprakash, Prasanna
contents	Atomistic materials modeling is a critical task with wide-ranging applications, from drug discovery to materials science, where accurate predictions of the target material property can lead to significant advancements in scientific discovery. Graph Neural Networks (GNNs) represent the state-of-the-art approach for modeling atomistic material data thanks to their capacity to capture complex relational structures. While machine learning performance has historically improved with larger models and datasets, GNNs for atomistic materials modeling remain relatively small compared to large language models (LLMs), which leverage billions of parameters and terabyte-scale datasets to achieve remarkable performance in their respective domains. To address this gap, we explore the scaling limits of GNNs for atomistic materials modeling by developing a foundational model with billions of parameters, trained on extensive terabyte-scale datasets. Our approach incorporates techniques from LLM libraries to efficiently manage large-scale data and models, enabling both effective training and deployment of these large-scale GNN models. This work addresses three fundamental questions in scaling GNNs: the potential for scaling GNN model architectures, the effect of dataset size on model accuracy, and the applicability of LLM-inspired techniques to GNN architectures. Specifically, the outcomes of this study include (1) insights into the scaling laws for GNNs, highlighting the relationship between model size, dataset volume, and accuracy, (2) a foundational GNN model optimized for atomistic materials modeling, and (3) a GNN codebase enhanced with advanced LLM-based training techniques. Our findings lay the groundwork for large-scale GNNs with billions of parameters and terabyte-scale datasets, establishing a scalable pathway for future advancements in atomistic materials modeling.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_08112
institution	arXiv
publishDate	2025
record_format	arxiv
title	Scaling Laws of Graph Neural Networks for Atomistic Materials Modeling
topic	Machine Learning; Materials Science
url	https://arxiv.org/abs/2504.08112

Similar Items