Bibliographic Details
Main Authors: Huang, Lin; Jiang, Arthur; Liu, XiaoLi; Wang, Zion; Zhao, Jason; Wang, Chu; Lu, HaoCheng; Huang, ChengXiang; Cheng, JiaJun; Du, YiYue; Zhang, Jia
Format: Preprint
Published: 2026
Subjects: Chemical Physics; Artificial Intelligence; Biological Physics
Online Access: https://arxiv.org/abs/2602.17709
author Huang, Lin
Jiang, Arthur
Liu, XiaoLi
Wang, Zion
Zhao, Jason
Wang, Chu
Lu, HaoCheng
Huang, ChengXiang
Cheng, JiaJun
Du, YiYue
Zhang, Jia
contents All-atom molecular simulation serves as a quintessential "computational microscope" for understanding the machinery of life, yet it remains fundamentally limited by the trade-off between quantum-mechanical (QM) accuracy and biological scale. We present UBio-MolFM, a universal foundation model framework specifically engineered to bridge this gap. UBio-MolFM introduces three synergistic innovations: (1) UBio-Mol26, a large bio-specific dataset constructed via a multi-fidelity "Two-Pronged Strategy" that combines systematic bottom-up enumeration with top-down sampling of native protein environments (up to 1,200 atoms); (2) E2Former-V2, a linear-scaling equivariant transformer that integrates Equivariant Axis-Aligned Sparsification (EAAS) and Long-Short Range (LSR) modeling to capture non-local physics with up to ~4x higher inference throughput in our large-system benchmarks; and (3) a Three-Stage Curriculum Learning protocol that transitions from energy initialization to energy-force consistency, with force-focused supervision to mitigate energy offsets. Rigorous benchmarking across microscopic forces and macroscopic observables (including liquid water structure, ionic solvation, and peptide folding) demonstrates that UBio-MolFM achieves ab initio-level fidelity on large, out-of-distribution biomolecular systems (up to ~1,500 atoms) and realistic MD observables. By reconciling scalability with quantum precision, UBio-MolFM provides a robust, ready-to-use tool for the next generation of computational biology.
format Preprint
id arxiv_https___arxiv_org_abs_2602_17709
institution arXiv
publishDate 2026
record_format arxiv
title UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems
topic Chemical Physics
Artificial Intelligence
Biological Physics
url https://arxiv.org/abs/2602.17709