MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Gao, Bowen; Jia, Yinjun; Mo, Yuanle; Ni, Yuyan; Ma, Weiying; Ma, Zhiming; Lan, Yanyan
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2310.07229

_version_	1866911790677360640
author	Gao, Bowen; Jia, Yinjun; Mo, Yuanle; Ni, Yuyan; Ma, Weiying; Ma, Zhiming; Lan, Yanyan
author_facet	Gao, Bowen; Jia, Yinjun; Mo, Yuanle; Ni, Yuyan; Ma, Weiying; Ma, Zhiming; Lan, Yanyan
contents	Pocket representations play a vital role in various biomedical applications, such as druggability estimation, ligand affinity prediction, and de novo drug design. While existing geometric features and pretrained representations have demonstrated promising results, they usually treat pockets independently of ligands, neglecting the fundamental interactions between them. However, the limited number of pocket-ligand complex structures available in the PDB database (fewer than 100 thousand non-redundant pairs) hampers large-scale pretraining endeavors for interaction modeling. To address this constraint, we propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures, assisted by highly effective pretrained small molecule representations. By segmenting protein structures into drug-like fragments and their corresponding pockets, we obtain a reasonable simulation of ligand-receptor interactions, resulting in the generation of over 5 million complexes. Subsequently, the pocket encoder is trained in a contrastive manner to align with the representations of pseudo-ligands furnished by pretrained small molecule encoders. Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction, pocket matching, and ligand binding affinity prediction. Notably, ProFSA surpasses other pretraining methods by a substantial margin. Moreover, our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.
format	Preprint
id	arxiv_https___arxiv_org_abs_2310_07229
institution	arXiv
publishDate	2023
record_format	arxiv
title	ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment
topic	Machine Learning
url	https://arxiv.org/abs/2310.07229