:: Library Catalog

Bibliographic Details
Main Authors:	Delgrange, Florent; Avalos, Raphael; Röpke, Willem
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.12312

Similar Items

Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments
by: Delgrange, Florent
Published: (2026)

World Modelling Improves Language Model Agents
by: Guo, Shangmin, et al.
Published: (2025)

Safe Deep Policy Adaptation
by: Xiao, Wenli, et al.
Published: (2023)

DéjàQ: Open-Ended Evolution of Diverse, Learnable and Verifiable Problems
by: Röpke, Willem, et al.
Published: (2026)

SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning
by: Anisimov, Maksim, et al.
Published: (2026)

Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning
by: Speziali, Paolo, et al.
Published: (2026)

SPI-GAN: Denoising Diffusion GANs with Straight-Path Interpolations
by: Jeon, Jinsung, et al.
Published: (2022)

Safe Exploration via Policy Priors
by: Wendl, Manuel, et al.
Published: (2026)

World Models via Policy-Guided Trajectory Diffusion
by: Rigter, Marc, et al.
Published: (2023)

Iterative Batch Reinforcement Learning via Safe Diversified Model-based Policy Search
by: Najib, Amna, et al.
Published: (2024)

SafeAR: Safe Algorithmic Recourse by Risk-Aware Policies
by: Wu, Haochen, et al.
Published: (2023)

Simulus: Combining Improvements in Sample-Efficient World Model Agents
by: Cohen, Lior, et al.
Published: (2025)

Towards Fast Safe Online Reinforcement Learning via Policy Finetuning
by: Chen, Keru, et al.
Published: (2024)

Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)

Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models
by: Peysakhovich, Alexander, et al.
Published: (2026)

Deep Active Inference with Diffusion Policy and Multiple Timescale World Model for Real-World Exploration and Navigation
by: Yokozawa, Riko, et al.
Published: (2025)

Safe Urban Traffic Control via Uncertainty-Aware Conformal Prediction and World-Model Reinforcement Learning
by: Chandra, Joydeep, et al.
Published: (2026)

Composing Reinforcement Learning Policies, with Formal Guarantees
by: Delgrange, Florent, et al.
Published: (2024)

StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement
by: Seo, Junwon, et al.
Published: (2026)

Online Planning in POMDPs with State-Requests
by: Avalos, Raphael, et al.
Published: (2024)

Safe Exploration Using Bayesian World Models and Log-Barrier Optimization
by: As, Yarden, et al.
Published: (2024)

SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories
by: Burnwal, Returaj, et al.
Published: (2025)

Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)

Blending Imitation and Reinforcement Learning for Robust Policy Improvement
by: Liu, Xuefeng, et al.
Published: (2023)

SafeDreamer: Safe Reinforcement Learning with World Models
by: Huang, Weidong, et al.
Published: (2023)

SOE: Sample-Efficient Robot Policy Self-Improvement via On-Manifold Exploration
by: Jin, Yang, et al.
Published: (2025)

Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies
by: Tayal, Mumuksh, et al.
Published: (2026)

SafePred: A Predictive Guardrail for Computer-Using Agents via World Models
by: Chen, Yurun, et al.
Published: (2026)

Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
by: Lee, Chi-Chang, et al.
Published: (2025)

Active Policy Improvement from Multiple Black-box Oracles
by: Liu, Xuefeng, et al.
Published: (2023)

Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)

Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)

Safe Reinforcement Learning for Real-World Engine Control
by: Bedei, Julian, et al.
Published: (2025)

DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning
by: Wang, Xuefeng, et al.
Published: (2024)

Classical and Deep Reinforcement Learning Inventory Control Policies for Pharmaceutical Supply Chains with Perishability and Non-Stationarity
by: Stranieri, Francesco, et al.
Published: (2025)

Policy and World Modeling Co-Training for Language Agents
by: Lu, Ning, et al.
Published: (2026)

Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)

Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning
by: Chemingui, Yassine, et al.
Published: (2024)

Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data
by: Xue, Ruiqi, et al.
Published: (2026)

Vertical Symbolic Regression via Deep Policy Gradient
by: Jiang, Nan, et al.
Published: (2024)