:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Haoyu; Wang, Jingcheng; Wu, Shunyu; Xiao, Xinwei
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2603.23232

Similar Items

Selective Uncertainty Propagation in Offline RL
by: Krishnamurthy, Sanath Kumar, et al.
Published: (2023)

Scalable Offline Model-Based RL with Action Chunks
by: Park, Kwanyoung, et al.
Published: (2025)

DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
by: Liu, Jinxin, et al.
Published: (2024)

Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL
by: Luo, Qin-Wen, et al.
Published: (2025)

Allocating Variance to Maximize Expectation
by: Leme, Renato Purita Paes, et al.
Published: (2025)

Offline Behavioral Data Selection
by: Lei, Shiye, et al.
Published: (2025)

HIQL: Offline Goal-Conditioned RL with Latent States as Actions
by: Park, Seohong, et al.
Published: (2023)

Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning
by: Wang, Qi, et al.
Published: (2023)

Uncertainty-Aware Graph Self-Training with Expectation-Maximization Regularization
by: Wang, Emily, et al.
Published: (2025)

Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference
by: Zhou, Xuwen, et al.
Published: (2026)

Action-Free Offline-to-Online RL via Discretised State Policies
by: Neggatu, Natinael Solomon, et al.
Published: (2026)

Augmenting Offline RL with Unlabeled Data
by: Wang, Zhao, et al.
Published: (2024)

Diffusion Alignment as Variational Expectation-Maximization
by: Lee, Jaewoo, et al.
Published: (2025)

Decoupled Prioritized Resampling for Offline RL
by: Yue, Yang, et al.
Published: (2023)

Density Operator Expectation Maximization
by: Vishnu, Adit, et al.
Published: (2025)

Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)

Behavior-Adaptive Q-Learning: A Unifying Framework for Offline-to-Online RL
by: Zu, Lipeng, et al.
Published: (2025)

DEAS: DEtached value learning with Action Sequence for Scalable Offline RL
by: Kim, Changyeon, et al.
Published: (2025)

Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows
by: Cho, Minjae, et al.
Published: (2024)

Improving and Accelerating Offline RL in Large Discrete Action Spaces with Structured Policy Initialization
by: Landers, Matthew, et al.
Published: (2026)

Reinformer: Max-Return Sequence Modeling for Offline RL
by: Zhuang, Zifeng, et al.
Published: (2024)

EM-Net: Gaze Estimation with Expectation Maximization Algorithm
by: Cheng, Zhang, et al.
Published: (2024)

Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies
by: Zhu, Lingwei, et al.
Published: (2025)

Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
by: Hu, Jifeng, et al.
Published: (2024)

Deep Generative Clustering with VAEs and Expectation-Maximization
by: Adipoetra, Michael, et al.
Published: (2025)

Massively Parallel Expectation Maximization For Approximate Posteriors
by: Heap, Thomas, et al.
Published: (2025)

General Flexible $f$-divergence for Challenging Offline RL Datasets with Low Stochasticity and Diverse Behavior Policies
by: Wang, Jianxun, et al.
Published: (2026)

Learning to Reason in LLMs by Expectation Maximization
by: Lee, Junghyun, et al.
Published: (2025)

Are Expressive Models Truly Necessary for Offline RL?
by: Wang, Guan, et al.
Published: (2024)

AdamO: A Collapse-Suppressed Optimizer for Offline RL
by: Qiao, Nan, et al.
Published: (2026)

Less is More: Clustered Cross-Covariance Control for Offline RL
by: Qiao, Nan, et al.
Published: (2026)

Improving Offline RL by Blending Heuristics
by: Geng, Sinong, et al.
Published: (2023)

Budgeting Counterfactual for Offline RL
by: Liu, Yao, et al.
Published: (2023)

Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
by: Wang, Zhi, et al.
Published: (2024)

Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
by: Yao, Qingmao, et al.
Published: (2025)

Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
by: Suttle, Wesley A., et al.
Published: (2025)

OMG-RL: Offline Model-based Guided Reward Learning for Heparin Treatment
by: Lim, Yooseok, et al.
Published: (2024)

Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only
by: Xiao, Wei, et al.
Published: (2025)

Expectation Maximization Pseudo Labels
by: Xu, Moucheng, et al.
Published: (2023)

SEMF: Supervised Expectation-Maximization Framework for Predicting Intervals
by: Azizi, Ilia, et al.
Published: (2024)