
Bibliographic Details
Main Authors:	Li, Yuetai; Inan, Huseyin A; Yue, Xiang; Chen, Wei-Ning; Wutschitz, Lukas; Kulkarni, Janardhan; Poovendran, Radha; Sim, Robert; Rajmohan, Saravan
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2511.01824

_version_	1866912685811040256
author	Li, Yuetai; Inan, Huseyin A; Yue, Xiang; Chen, Wei-Ning; Wutschitz, Lukas; Kulkarni, Janardhan; Poovendran, Radha; Sim, Robert; Rajmohan, Saravan
author_facet	Li, Yuetai; Inan, Huseyin A; Yue, Xiang; Chen, Wei-Ning; Wutschitz, Lukas; Kulkarni, Janardhan; Poovendran, Radha; Sim, Robert; Rajmohan, Saravan
contents	LLM agents excel in compact environments requiring deep reasoning but remain brittle when operating in broader, more complex contexts that demand robustness across diverse tools and schemas. Building bespoke environments for training is heavy, brittle, and limits progress. In this paper, we demonstrate that LLMs can simulate realistic environment feedback without access to actual testbed data or APIs. Inspired by this capability, we propose two frameworks: Simia-SFT, a pipeline that synthesizes SFT data by amplifying small seed sets into diverse trajectories in an environment-agnostic manner, and Simia-RL, a framework that enables RL training without real environment implementations through LLM-simulated feedback. Fine-tuning open models yields consistent improvements across multiple benchmarks, surpassing GPT-4o and approaching o4-mini on $τ^2$-Bench. Together, Simia-SFT and Simia-RL enable scalable agent training without environment engineering, replacing heavy and brittle implementations with flexible LLM-based simulation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_01824
institution	arXiv
publishDate	2025
record_format	arxiv
title	Simulating Environments with Reasoning Models for Agent Training
topic	Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2511.01824
