| Main Author: | Yang, Sherry |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.00785 |
Similar Items
Learning Interactive Real-World Simulators
by: Yang, Sherry, et al.
Published: (2023)
WorldGym: World Model as An Environment for Policy Evaluation
by: Quevedo, Julian, et al.
Published: (2025)
World-Gymnast: Training Robots with Reinforcement Learning in a World Model
by: Sharma, Ansh Kumar, et al.
Published: (2026)
TerminalWorld: Benchmarking Agents on Real-World Terminal Tasks
by: Chu, Zhaoyang, et al.
Published: (2026)
Video as the New Language for Real-World Decision Making
by: Yang, Sherry, et al.
Published: (2024)
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
by: Dong, Guanting, et al.
Published: (2026)
Smart Language Agents in Real-World Planning
by: Miin, Annabelle, et al.
Published: (2024)
MAPF-World: Action World Model for Multi-Agent Path Finding
by: Yang, Zhanjiang, et al.
Published: (2025)
Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning
by: Ma, Zhenyu, et al.
Published: (2026)
ResearchGym: Evaluating Language Model Agents on Real-World AI Research
by: Garikaparthi, Aniketh, et al.
Published: (2026)
How Well Does Agent Development Reflect Real-World Work?
by: Wang, Zora Zhiruo, et al.
Published: (2026)
Patient-Zero: Scaling Synthetic Patient Agents to Real-World Distributions without Real Patient Data
by: Lai, Yunghwei, et al.
Published: (2025)
DeliveryBench: Can Agents Earn Profit in Real World?
by: Mao, Lingjun, et al.
Published: (2025)
MobileWorldBench: Towards Semantic World Modeling For Mobile Agents
by: Li, Shufan, et al.
Published: (2025)
Ask-before-Plan: Proactive Language Agents for Real-World Planning
by: Zhang, Xuan, et al.
Published: (2024)
Evaluating Privilege Usage of Agents with Real-World Tools
by: Zhang, Quan, et al.
Published: (2026)
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges
by: Qian, Cheng, et al.
Published: (2025)
WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents
by: Zhou, Siyu, et al.
Published: (2024)
Embodied AI Agents: Modeling the World
by: Fung, Pascale, et al.
Published: (2025)
PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
by: Liu, Ruoqi, et al.
Published: (2026)
NEWSAGENT: Benchmarking Multimodal Agents as Journalists with Real-World Newswriting Tasks
by: Chien, Yen-Che, et al.
Published: (2025)
FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use
by: Lu, Jiaxuan, et al.
Published: (2026)
FutureWorld: A Live Reinforcement Learning Environment for Predictive Agents with Real-World Outcome Rewards
by: Han, Zhixin, et al.
Published: (2026)
From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World
by: Conde, Pedro, et al.
Published: (2026)
OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety
by: Vijayvargiya, Sanidhya, et al.
Published: (2025)
GUI-Robust: A Comprehensive Dataset for Testing GUI Agent Robustness in Real-World Anomalies
by: Yang, Jingqi, et al.
Published: (2025)
Jenius Agent: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios
by: Xia, Defei, et al.
Published: (2026)
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
by: Yu, Zhengxu, et al.
Published: (2026)
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
by: Guo, Junliang, et al.
Published: (2025)
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
by: Wang, Wenhao, et al.
Published: (2026)
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
by: Zhou, Siyu, et al.
Published: (2025)
AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
by: Li, Keyu, et al.
Published: (2026)
SWE-Next: Scalable Real-World Software Engineering Tasks for Agents
by: Liang, Jiarong, et al.
Published: (2026)
World Modelling Improves Language Model Agents
by: Guo, Shangmin, et al.
Published: (2025)
Simulation Distillation: Pretraining World Models in Simulation for Rapid Real-World Adaptation
by: Levy, Jacob, et al.
Published: (2026)
WebWorld: A Large-Scale World Model for Web Agent Training
by: Xiao, Zikai, et al.
Published: (2026)
$τ$-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains
by: Ray, Soham, et al.
Published: (2026)
Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback
by: Hu, Mengkang, et al.
Published: (2025)
Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing
by: Lin, Justin W., et al.
Published: (2025)
DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis
by: Zhang, Qiaohong, et al.
Published: (2026)