:: Library Catalog

Bibliographic Details
Main Author:	Yang, Sherry
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.00785

Similar Items

Learning Interactive Real-World Simulators
by: Yang, Sherry, et al.
Published: (2023)

WorldGym: World Model as An Environment for Policy Evaluation
by: Quevedo, Julian, et al.
Published: (2025)

World-Gymnast: Training Robots with Reinforcement Learning in a World Model
by: Sharma, Ansh Kumar, et al.
Published: (2026)

TerminalWorld: Benchmarking Agents on Real-World Terminal Tasks
by: Chu, Zhaoyang, et al.
Published: (2026)

Video as the New Language for Real-World Decision Making
by: Yang, Sherry, et al.
Published: (2024)

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
by: Dong, Guanting, et al.
Published: (2026)

Smart Language Agents in Real-World Planning
by: Miin, Annabelle, et al.
Published: (2024)

MAPF-World: Action World Model for Multi-Agent Path Finding
by: Yang, Zhanjiang, et al.
Published: (2025)

Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning
by: Ma, Zhenyu, et al.
Published: (2026)

ResearchGym: Evaluating Language Model Agents on Real-World AI Research
by: Garikaparthi, Aniketh, et al.
Published: (2026)

How Well Does Agent Development Reflect Real-World Work?
by: Wang, Zora Zhiruo, et al.
Published: (2026)

Patient-Zero: Scaling Synthetic Patient Agents to Real-World Distributions without Real Patient Data
by: Lai, Yunghwei, et al.
Published: (2025)

DeliveryBench: Can Agents Earn Profit in Real World?
by: Mao, Lingjun, et al.
Published: (2025)

MobileWorldBench: Towards Semantic World Modeling For Mobile Agents
by: Li, Shufan, et al.
Published: (2025)

Ask-before-Plan: Proactive Language Agents for Real-World Planning
by: Zhang, Xuan, et al.
Published: (2024)

Evaluating Privilege Usage of Agents with Real-World Tools
by: Zhang, Quan, et al.
Published: (2026)

ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges
by: Qian, Cheng, et al.
Published: (2025)

WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents
by: Zhou, Siyu, et al.
Published: (2024)

Embodied AI Agents: Modeling the World
by: Fung, Pascale, et al.
Published: (2025)

PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
by: Liu, Ruoqi, et al.
Published: (2026)

NEWSAGENT: Benchmarking Multimodal Agents as Journalists with Real-World Newswriting Tasks
by: Chien, Yen-Che, et al.
Published: (2025)

FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use
by: Lu, Jiaxuan, et al.
Published: (2026)

FutureWorld: A Live Reinforcement Learning Environment for Predictive Agents with Real-World Outcome Rewards
by: Han, Zhixin, et al.
Published: (2026)

From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World
by: Conde, Pedro, et al.
Published: (2026)

OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety
by: Vijayvargiya, Sanidhya, et al.
Published: (2025)

GUI-Robust: A Comprehensive Dataset for Testing GUI Agent Robustness in Real-World Anomalies
by: Yang, Jingqi, et al.
Published: (2025)

Jenius Agent: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios
by: Xia, Defei, et al.
Published: (2026)

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
by: Yu, Zhengxu, et al.
Published: (2026)

MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
by: Guo, Junliang, et al.
Published: (2025)

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
by: Wang, Wenhao, et al.
Published: (2026)

WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
by: Zhou, Siyu, et al.
Published: (2025)

AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
by: Li, Keyu, et al.
Published: (2026)

SWE-Next: Scalable Real-World Software Engineering Tasks for Agents
by: Liang, Jiarong, et al.
Published: (2026)

World Modelling Improves Language Model Agents
by: Guo, Shangmin, et al.
Published: (2025)

Simulation Distillation: Pretraining World Models in Simulation for Rapid Real-World Adaptation
by: Levy, Jacob, et al.
Published: (2026)

WebWorld: A Large-Scale World Model for Web Agent Training
by: Xiao, Zikai, et al.
Published: (2026)

$τ$-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains
by: Ray, Soham, et al.
Published: (2026)

Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback
by: Hu, Mengkang, et al.
Published: (2025)

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing
by: Lin, Justin W., et al.
Published: (2025)

DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis
by: Zhang, Qiaohong, et al.
Published: (2026)