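Staff View: DISPROTBENCH: Uncovering the Functional Limits of Protein Structure Prediction Models in Intrinsically Disordered Regions :: Library Catalog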

Bibliographic Details
Main Authors:	Zeng, Xinyue; Wang, Tuo; Kulkarni, Adithya; Lu, Alexander; Ni, Alexandra; Xing, Phoebe; Zhao, Junhan; Chen, Siwei; Zhou, Dawei
Format:	Preprint
Published:	2025
Subjects:	Biomolecules; Machine Learning
Online Access:	https://arxiv.org/abs/2507.02883

_version_	1866911435884331008
author	Zeng, Xinyue; Wang, Tuo; Kulkarni, Adithya; Lu, Alexander; Ni, Alexandra; Xing, Phoebe; Zhao, Junhan; Chen, Siwei; Zhou, Dawei
author_facet	Zeng, Xinyue; Wang, Tuo; Kulkarni, Adithya; Lu, Alexander; Ni, Alexandra; Xing, Phoebe; Zhao, Junhan; Chen, Siwei; Zhou, Dawei
contents	Intrinsically disordered regions (IDRs) play central roles in cellular function, yet remain poorly evaluated by existing protein structure prediction benchmarks. Current evaluations largely focus on well-folded domains, overlooking three fundamental challenges in realistic biological settings: the structural complexity of proteins, the resulting low availability of reliable ground truth, and prediction uncertainty that can propagate into high-risk downstream failures, such as in drug discovery, protein-protein interaction modeling, and functional annotation. We present DisProtBench, an IDR-centric benchmark that explicitly incorporates prediction uncertainty into the evaluation of protein structure prediction models (PSPMs). To address structural complexity and ground-truth scarcity, we curate and unify a large-scale, multi-modal dataset spanning disease-relevant IDRs, GPCR-ligand interactions, and multimeric protein complexes. To assess predictive uncertainty, we introduce Functional Uncertainty Sensitivity (FUS), a novel prediction uncertainty-stratified metric that quantifies downstream task performance under prediction uncertainty. Using this benchmark, we conduct a systematic evaluation of state-of-the-art PSPMs and reveal clear, task-dependent failure modes. Protein-protein interaction prediction degrades sharply in IDRs, while structure-based drug discovery remains comparatively robust. These effects are largely invisible to standard global accuracy metrics, which overestimate functional reliability under prediction uncertainty. We have open-sourced our benchmark and the codebase at https://github.com/Susan571/DisProtBench.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_02883
institution	arXiv
publishDate	2025
record_format	arxiv
title	DISPROTBENCH: Uncovering the Functional Limits of Protein Structure Prediction Models in Intrinsically Disordered Regions
topic	Biomolecules; Machine Learning
url	https://arxiv.org/abs/2507.02883
