:: Library Catalog

Bibliographic Details
Main Author:	Delgrange, Florent
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.23997

Similar Items

Deep SPI: Safe Policy Improvement via World Models
by: Delgrange, Florent, et al.
Published: (2025)

E-valuator: Reliable Agent Verifiers with Sequential Hypothesis Testing
by: Sadhuka, Shuvom, et al.
Published: (2025)

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents
by: Wang, Bowen, et al.
Published: (2026)

D3-Gym: Constructing Real-World Verifiable Environments for Data-Driven Discovery
by: Moussa, Hanane Nour, et al.
Published: (2026)

DeLF: Designing Learning Environments with Foundation Models
by: Afshar, Aida, et al.
Published: (2024)

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning
by: Wang, Zhaoyang, et al.
Published: (2026)

Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
by: Wen, Yeming, et al.
Published: (2024)

FutureWorld: A Live Reinforcement Learning Environment for Predictive Agents with Real-World Outcome Rewards
by: Han, Zhixin, et al.
Published: (2026)

GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training
by: Cao, Yuan, et al.
Published: (2026)

Domain-Adapted Small Language Models for Reliable Clinical Triage
by: Aljohani, Manar, et al.
Published: (2026)

Benchmarking World-Model Learning with Environment-Level Queries
by: Warrier, Archana, et al.
Published: (2025)

AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations
by: Verma, Gaurav, et al.
Published: (2024)

The Evaluation Game: Beyond Static LLM Benchmarking
by: Wang, Paul, et al.
Published: (2026)

Adapting World Models with Latent-State Dynamics Residuals
by: Lanier, JB, et al.
Published: (2025)

Go Beyond Black-box Policies: Rethinking the Design of Learning Agent for Interpretable and Verifiable HVAC Control
by: An, Zhiyu, et al.
Published: (2024)

Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents
by: Zala, Abhay, et al.
Published: (2024)

AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
by: Rawles, Christopher, et al.
Published: (2024)

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
by: Stojanovski, Zafir, et al.
Published: (2025)

World Action Verifier: Self-Improving World Models via Forward-Inverse Asymmetry
by: Liu, Yuejiang, et al.
Published: (2026)

Microeconomic Foundations of Multi-Agent Learning
by: Helou, Nassim
Published: (2026)

DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents
by: Zhao, Jiahao, et al.
Published: (2026)

Beyond Static Uncertainty: Modeling Temporal Uncertainty Dynamics for Probabilistic Time Series Forecasting
by: Wang, Yijun, et al.
Published: (2026)

Beyond Verifiable Rewards: Rubric-Based GRM for Reinforced Fine-Tuning SWE Agents
by: Huang, Jiawei, et al.
Published: (2026)

Adapting the Behavior of Reinforcement Learning Agents to Changing Action Spaces and Reward Functions
by: de la Rosa, Raul, et al.
Published: (2026)

Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale
by: Roth, Amit, et al.
Published: (2026)

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
by: Guan, Xinyan, et al.
Published: (2024)

Beyond Binary: Turning Partial Success into Dense Verifiable Rewards for Reinforcement Learning in Code Generation
by: Wang, Longwen, et al.
Published: (2026)

A Guide to Failure in Machine Learning: Reliability and Robustness from Foundations to Practice
by: Heim, Eric, et al.
Published: (2025)

CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments
by: Yu, Yi, et al.
Published: (2026)

External Model Motivated Agents: Reinforcement Learning for Enhanced Environment Sampling
by: Bhagat, Rishav, et al.
Published: (2024)

Towards Adapting Reinforcement Learning Agents to New Tasks: Insights from Q-Values
by: Ramaswamy, Ashwin, et al.
Published: (2024)

An Interactive Agent Foundation Model
by: Durante, Zane, et al.
Published: (2024)

A Reliable Knowledge Processing Framework for Combustion Science using Foundation Models
by: Sharma, Vansh, et al.
Published: (2023)

Raising the Bar in Graph OOD Generalization: Invariant Learning Beyond Explicit Environment Modeling
by: Shen, Xu, et al.
Published: (2025)

RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
by: Zhang, Zijing, et al.
Published: (2025)

PhoneWorld: Scaling Phone-Use Agent Environments
by: Tang, Zhengyang, et al.
Published: (2026)

Diffusion Transformers as Open-World Spatiotemporal Foundation Models
by: Yuan, Yuan, et al.
Published: (2024)

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
by: Cai, Xin-Qiang, et al.
Published: (2025)

Diffusion World Model: Future Modeling Beyond Step-by-Step Rollout for Offline Reinforcement Learning
by: Ding, Zihan, et al.
Published: (2024)