
Bibliographic Details
Main Authors:	Teufel, Felix, Kollasch, Aaron W., Huang, Yining, Winther, Ole, Yang, Kevin K., Notin, Pascal, Marks, Debora S.
Format:	Preprint
Published:	2025
Subjects:	Biomolecules Machine Learning
Online Access:	https://arxiv.org/abs/2512.02315

Table of Contents:

Accurately predicting protein fitness with minimal experimental data is a persistent challenge in protein engineering. We introduce PRIMO (PRotein In-context Mutation Oracle), a transformer-based framework that leverages in-context learning and test-time training to adapt rapidly to new proteins and assays without large task-specific datasets. By encoding sequence information, auxiliary zero-shot predictions, and sparse experimental labels from many assays as a unified token set in a pre-training masked-language modeling paradigm, PRIMO learns to prioritize promising variants through a preference-based loss function. Across diverse protein families and properties, including both substitution and indel mutations, PRIMO outperforms zero-shot and fully supervised baselines. This work underscores the power of combining large-scale pre-training with efficient test-time adaptation to tackle challenging protein design tasks where data collection is expensive and label availability is limited.
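
The abstract does not say which preference-based loss PRIMO uses. As an illustration of the general idea only, the sketch below implements a generic pairwise (Bradley-Terry style) ranking loss: for every pair of variants with different measured fitness, the model is penalized when it scores the less fit variant higher. The function name, the all-pairs scheme, and the toy numbers are assumptions for this example, not details taken from the paper.

```python
import math
from itertools import combinations


def pairwise_preference_loss(scores, labels):
    """Mean pairwise ranking loss over all variant pairs with different labels.

    Hypothetical helper for illustration; not the loss defined in the paper.
    scores: model-predicted fitness scores, one per variant.
    labels: measured fitness values, one per variant.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if labels[i] == labels[j]:
            continue  # ties carry no preference signal
        # Order the pair so that `hi` is the variant with higher measured fitness.
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        margin = scores[hi] - scores[lo]
        # -log(sigmoid(margin)) = log(1 + exp(-margin)), computed stably.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy data (made up): three variants with increasing measured fitness.
measured = [0.1, 0.5, 0.9]
well_ranked = pairwise_preference_loss([-1.0, 0.0, 1.0], measured)
mis_ranked = pairwise_preference_loss([1.0, 0.0, -1.0], measured)
print(well_ranked < mis_ranked)
```

A loss of this kind depends only on the relative ordering of variants, not on the absolute scale of the labels, which is one reason ranking objectives are often considered when labels come from many assays with different units.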
