Bibliographic Details
Main Authors: Hu, Yutong; Tan, Yang; Han, Andi; Zheng, Lirong; Hong, Liang; Zhou, Bingxin
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2407.07443
_version_ 1866911950916550656
author Hu, Yutong
Tan, Yang
Han, Andi
Zheng, Lirong
Hong, Liang
Zhou, Bingxin
contents The advent of deep learning has introduced efficient approaches for de novo protein sequence design, significantly improving success rates and reducing development costs compared to computational or experimental methods. However, existing methods struggle to generate proteins with diverse lengths and shapes while maintaining key structural features. To address these challenges, we introduce CPDiffusion-SS, a latent graph diffusion model that generates protein sequences based on coarse-grained secondary structural information. CPDiffusion-SS offers greater flexibility in producing a variety of novel amino acid sequences while preserving overall structural constraints, thus enhancing the reliability and diversity of generated proteins. Experimental analyses show that CPDiffusion-SS produces more diverse and novel sequences than popular baseline methods on open benchmarks across various quantitative measurements. Furthermore, we provide a series of case studies to highlight the biological significance of the proposed method's generation performance. The source code is publicly available at https://github.com/riacd/CPDiffusion-SS
format Preprint
id arxiv_https___arxiv_org_abs_2407_07443
institution arXiv
publishDate 2024
record_format arxiv
title Secondary Structure-Guided Novel Protein Sequence Generation with Latent Graph Diffusion
topic Artificial Intelligence
url https://arxiv.org/abs/2407.07443