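Staff View: Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE :: Library Catalog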

Bibliographic Details
Main Authors:	Firoz, Jesun; Pellegrini, Franco; Geiger, Mario; Hsu, Darren; Bilbrey, Jenna A.; Chou, Han-Yi; Stadler, Maximilian; Hoehnerbach, Markus; Wang, Tingyu; Lin, Dejun; Kucukbenli, Emine; Sprueill, Henry W.; Batatia, Ilyes; Xantheas, Sotiris S.; Lee, MalSoon; Mundy, Chris; Csanyi, Gabor; Smith, Justin S.; Sadayappan, Ponnuswamy; Choudhury, Sutanay
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.10700

_version_	1866912328937635840
author	Firoz, Jesun; Pellegrini, Franco; Geiger, Mario; Hsu, Darren; Bilbrey, Jenna A.; Chou, Han-Yi; Stadler, Maximilian; Hoehnerbach, Markus; Wang, Tingyu; Lin, Dejun; Kucukbenli, Emine; Sprueill, Henry W.; Batatia, Ilyes; Xantheas, Sotiris S.; Lee, MalSoon; Mundy, Chris; Csanyi, Gabor; Smith, Justin S.; Sadayappan, Ponnuswamy; Choudhury, Sutanay
author_facet	Firoz, Jesun; Pellegrini, Franco; Geiger, Mario; Hsu, Darren; Bilbrey, Jenna A.; Chou, Han-Yi; Stadler, Maximilian; Hoehnerbach, Markus; Wang, Tingyu; Lin, Dejun; Kucukbenli, Emine; Sprueill, Henry W.; Batatia, Ilyes; Xantheas, Sotiris S.; Lee, MalSoon; Mundy, Chris; Csanyi, Gabor; Smith, Justin S.; Sadayappan, Ponnuswamy; Choudhury, Sutanay
contents	Chemistry Foundation Models (CFMs) that leverage Graph Neural Networks (GNNs) operating on 3D molecular graph structures are becoming indispensable tools for computational chemists and materials scientists. These models facilitate the understanding of matter and the discovery of new molecules and materials. In contrast to GNNs operating on a large homogeneous graph, GNNs used by CFMs process a large number of geometric graphs of varying sizes, requiring different optimization strategies than those developed for large homogeneous GNNs. This paper presents optimizations for two critical phases of CFM training: data distribution and model training, targeting MACE, a state-of-the-art CFM. We address the challenge of load balancing in data distribution by formulating it as a multi-objective bin packing problem. We propose an iterative algorithm that provides a highly effective, fast, and practical solution, ensuring efficient data distribution. For the training phase, we identify symmetric tensor contraction as the key computational kernel in MACE and optimize this kernel to improve the overall performance. Our combined approach of balanced data distribution and kernel optimization significantly enhances the training process of MACE. Experimental results demonstrate a substantial speedup, reducing per-epoch execution time for training from 12 to 2 minutes on 740 GPUs with a 2.6M sample dataset.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_10700
institution	arXiv
publishDate	2025
record_format	arxiv
title	Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE
topic	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
url	https://arxiv.org/abs/2504.10700
