Bibliographic Details
Main Authors: Teneggi, Jacopo, Turzo, S. M. Bargeen A., Marwah, Tanya, Bietti, Alberto, Renfrew, P. Douglas, Mulligan, Vikram Khipple, Golkar, Siavash
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2603.15952
author Teneggi, Jacopo
Turzo, S. M. Bargeen A.
Marwah, Tanya
Bietti, Alberto
Renfrew, P. Douglas
Mulligan, Vikram Khipple
Golkar, Siavash
contents Large language models (LLMs) are capable of emulating reasoning and using tools, creating opportunities for autonomous agents that execute complex scientific tasks. Protein design provides a natural testbed: although machine learning (ML) methods achieve strong results, these are largely restricted to canonical amino acids and narrow objectives, leaving an unmet need for a generalist tool for broad design pipelines. We introduce Agent Rosetta, an LLM agent paired with a structured environment for operating Rosetta, the leading physics-based heteropolymer design software, capable of modeling non-canonical building blocks and geometries. Agent Rosetta iteratively refines designs to achieve user-defined objectives, combining LLM reasoning with Rosetta's generality. We evaluate Agent Rosetta on design with canonical amino acids, matching specialized models and expert baselines, and with non-canonical residues (where ML approaches fail), achieving comparable performance. Critically, prompt engineering alone often fails to generate Rosetta actions, demonstrating that environment design is essential for integrating LLM agents with specialized software. Our results show that properly designed environments enable LLM agents to make scientific software accessible while matching specialized tools and human experts.
format Preprint
id arxiv_https___arxiv_org_abs_2603_15952
institution arXiv
publishDate 2026
record_format arxiv
title Protein Design with Agent Rosetta: A Case Study for Specialized Scientific Agents
topic Artificial Intelligence
url https://arxiv.org/abs/2603.15952