| Main Authors: | Delgrange, Florent, Avalos, Raphael, Röpke, Willem |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.12312 |
Similar Items
Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments
by: Delgrange, Florent
Published: (2026)
World Modelling Improves Language Model Agents
by: Guo, Shangmin, et al.
Published: (2025)
Safe Deep Policy Adaptation
by: Xiao, Wenli, et al.
Published: (2023)
DéjàQ: Open-Ended Evolution of Diverse, Learnable and Verifiable Problems
by: Röpke, Willem, et al.
Published: (2026)
SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning
by: Anisimov, Maksim, et al.
Published: (2026)
Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning
by: Speziali, Paolo, et al.
Published: (2026)
SPI-GAN: Denoising Diffusion GANs with Straight-Path Interpolations
by: Jeon, Jinsung, et al.
Published: (2022)
Safe Exploration via Policy Priors
by: Wendl, Manuel, et al.
Published: (2026)
World Models via Policy-Guided Trajectory Diffusion
by: Rigter, Marc, et al.
Published: (2023)
Iterative Batch Reinforcement Learning via Safe Diversified Model-based Policy Search
by: Najib, Amna, et al.
Published: (2024)
SafeAR: Safe Algorithmic Recourse by Risk-Aware Policies
by: Wu, Haochen, et al.
Published: (2023)
Simulus: Combining Improvements in Sample-Efficient World Model Agents
by: Cohen, Lior, et al.
Published: (2025)
Towards Fast Safe Online Reinforcement Learning via Policy Finetuning
by: Chen, Keru, et al.
Published: (2024)
Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)
Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models
by: Peysakhovich, Alexander, et al.
Published: (2026)
Deep Active Inference with Diffusion Policy and Multiple Timescale World Model for Real-World Exploration and Navigation
by: Yokozawa, Riko, et al.
Published: (2025)
Safe Urban Traffic Control via Uncertainty-Aware Conformal Prediction and World-Model Reinforcement Learning
by: Chandra, Joydeep, et al.
Published: (2026)
Composing Reinforcement Learning Policies, with Formal Guarantees
by: Delgrange, Florent, et al.
Published: (2024)
StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement
by: Seo, Junwon, et al.
Published: (2026)
Online Planning in POMDPs with State-Requests
by: Avalos, Raphael, et al.
Published: (2024)
Safe Exploration Using Bayesian World Models and Log-Barrier Optimization
by: As, Yarden, et al.
Published: (2024)
SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories
by: Burnwal, Returaj, et al.
Published: (2025)
Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)
Blending Imitation and Reinforcement Learning for Robust Policy Improvement
by: Liu, Xuefeng, et al.
Published: (2023)
SafeDreamer: Safe Reinforcement Learning with World Models
by: Huang, Weidong, et al.
Published: (2023)
SOE: Sample-Efficient Robot Policy Self-Improvement via On-Manifold Exploration
by: Jin, Yang, et al.
Published: (2025)
Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies
by: Tayal, Mumuksh, et al.
Published: (2026)
SafePred: A Predictive Guardrail for Computer-Using Agents via World Models
by: Chen, Yurun, et al.
Published: (2026)
Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
by: Lee, Chi-Chang, et al.
Published: (2025)
Active Policy Improvement from Multiple Black-box Oracles
by: Liu, Xuefeng, et al.
Published: (2023)
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)
Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)
Safe Reinforcement Learning for Real-World Engine Control
by: Bedei, Julian, et al.
Published: (2025)
DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning
by: Wang, Xuefeng, et al.
Published: (2024)
Classical and Deep Reinforcement Learning Inventory Control Policies for Pharmaceutical Supply Chains with Perishability and Non-Stationarity
by: Stranieri, Francesco, et al.
Published: (2025)
Policy and World Modeling Co-Training for Language Agents
by: Lu, Ning, et al.
Published: (2026)
Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)
Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning
by: Chemingui, Yassine, et al.
Published: (2024)
Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data
by: Xue, Ruiqi, et al.
Published: (2026)
Vertical Symbolic Regression via Deep Policy Gradient
by: Jiang, Nan, et al.
Published: (2024)