Bibliographic Details
Main Authors: Jiang, Zixuan; Xu, Renjing
Format: Preprint
Published: 2025
Subjects: Biomolecules; Artificial Intelligence
Online Access: https://arxiv.org/abs/2506.07035
author Jiang, Zixuan
Xu, Renjing
contents Deciphering protein function remains a fundamental challenge in protein representation learning. The task presents significant difficulties for protein language models (PLMs) due to the sheer volume of functional annotation categories and the highly imbalanced distribution of annotated instances across biological ontologies. Inspired by the remarkable success of reinforcement learning from human feedback (RLHF) in large language model (LLM) alignment, we propose AnnoDPO, a novel multi-modal framework for protein function prediction that leverages Direct Preference Optimization (DPO) to enhance annotation learning. Our methodology addresses the dual challenges of annotation scarcity and category imbalance through preference-aligned training objectives, establishing a new paradigm for biological knowledge integration in protein representation learning.
format Preprint
id arxiv_https___arxiv_org_abs_2506_07035
institution arXiv
publishDate 2025
record_format arxiv
title AnnoDPO: Protein Functional Annotation Learning with Direct Preference Optimization
topic Biomolecules
Artificial Intelligence
url https://arxiv.org/abs/2506.07035