Bibliographic Details
Main Authors: Tuo, Hao, Li, Yan, Hu, Xuanning, Zhao, Haishi, Liu, Xueyan, Yang, Bo
Format: Preprint
Published: 2025
Subjects: Biomolecules, Machine Learning
Online Access: https://arxiv.org/abs/2507.13580
_version_ 1866918100932231168
author Tuo, Hao
Li, Yan
Hu, Xuanning
Zhao, Haishi
Liu, Xueyan
Yang, Bo
contents Combinatorial optimization algorithms are essential in computer-aided drug design: they progressively explore chemical space to design lead compounds with high affinity for a target protein. However, current methods face inherent challenges in integrating domain knowledge, which limits their performance in identifying lead compounds with novel and valid binding modes. Here, we propose AutoLeadDesign, a lead compound design framework that pairs the extensive domain knowledge encoded in large language models with chemical fragments to explore vast chemical space progressively and efficiently. Comprehensive experiments indicate that AutoLeadDesign outperforms baseline methods. Notably, empirical lead design campaigns against two clinically relevant targets (PRMT5 and SARS-CoV-2 PLpro) demonstrate AutoLeadDesign's competence in de novo generation of lead compounds, achieving expert-competitive design efficacy. Structural analysis further confirms their mechanism-validated inhibitory patterns. By tracing the design process, we find that AutoLeadDesign shares analogous mechanisms with fragment-based drug design, which traditionally relies on expert decision-making, further revealing why it works. Overall, AutoLeadDesign offers an efficient approach to lead compound design, suggesting its potential utility in drug design.
format Preprint
id arxiv_https___arxiv_org_abs_2507_13580
institution arXiv
publishDate 2025
record_format arxiv
title A Collaborative Framework Integrating Large Language Model and Chemical Fragment Space: Mutual Inspiration for Lead Design
topic Biomolecules
Machine Learning
url https://arxiv.org/abs/2507.13580