:: Library Catalog

Bibliographic Details
Main Authors:	Gong, Yuehu; Wang, Zeyuan; Chen, Yulin; Ding, Shutong; Zhou, Qingyuan; Fu, Yanwei
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2603.21621

Similar Items

Stochastic MeanFlow Policies: One-Step Generative Control with Entropic Mirror Descent
by: Wang, Zeyuan, et al.
Published: (2026)

Distributional Reinforcement Learning with Diffusion Bridge Critics
by: Ding, Shutong, et al.
Published: (2026)

One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow
by: Wang, Zeyuan, et al.
Published: (2025)

Variational Online Mirror Descent for Robust Learning in Schrödinger Bridge
by: Han, Dong-Sig, et al.
Published: (2025)

One-Step Flow Policy Mirror Descent
by: Chen, Tianyi, et al.
Published: (2025)

Value Mirror Descent for Reinforcement Learning
by: Jia, Zhichao, et al.
Published: (2026)

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
by: Ding, Shutong, et al.
Published: (2025)

Policy Mirror Descent with Lookahead
by: Protopapas, Kimon, et al.
Published: (2024)

On the Effect of Regularization in Policy Mirror Descent
by: Kleuker, Jan Felix, et al.
Published: (2025)

Policy Mirror Descent with Temporal Difference Learning: Sample Complexity under Online Markov Data
by: Li, Wenye, et al.
Published: (2025)

On the Convergence of Policy in Unregularized Policy Mirror Descent
by: Lin, Dachao, et al.
Published: (2022)

Functional Acceleration for Policy Mirror Descent
by: Chelu, Veronica, et al.
Published: (2024)

FlowCritic: Bridging Value Estimation with Flow Matching in Reinforcement Learning
by: Zhong, Shan, et al.
Published: (2025)

Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds
by: Xu, Zhenghao, et al.
Published: (2023)

Mirror Descent on Reproducing Kernel Banach Spaces
by: Kumar, Akash, et al.
Published: (2024)

Mirror and Preconditioned Gradient Descent in Wasserstein Space
by: Bonet, Clément, et al.
Published: (2024)

Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization
by: Chen, Sijia, et al.
Published: (2023)

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization
by: Ding, Shutong, et al.
Published: (2024)

Flow Matching Policy Optimization with Mirror Descent and Entropy Constraints
by: Gao, Ting, et al.
Published: (2026)

On the Convergence of Policy Mirror Descent with Temporal Difference Evaluation
by: Liu, Jiacai, et al.
Published: (2025)

A Mirror Descent Perspective of Smoothed Sign Descent
by: Wang, Shuyang, et al.
Published: (2024)

Convergence of Policy Mirror Descent Beyond Compatible Function Approximation
by: Sherman, Uri, et al.
Published: (2025)

StaQ it! Growing neural networks for Policy Mirror Descent
by: Shilova, Alena, et al.
Published: (2025)

Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies
by: Li, Xiang, et al.
Published: (2026)

A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
by: Alfano, Carlo, et al.
Published: (2023)

Never Saddle for Reparameterized Steepest Descent as Mirror Flow
by: Jacobs, Tom, et al.
Published: (2026)

Estimating Individual Dose-Response Curves under Unobserved Confounders from Observational Data
by: Chen, Shutong, et al.
Published: (2024)

Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
by: Sun, Mingyang, et al.
Published: (2025)

Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
by: Lee, Joongkyu, et al.
Published: (2026)

Parameter-free Mirror Descent
by: Jacobsen, Andrew, et al.
Published: (2022)

Mirror Descent on Riemannian Manifolds
by: Jiang, Jiaxin, et al.
Published: (2026)

Stress-Aware Learning under KL Drift via Trust-Decayed Mirror Descent
by: Raj, Gabriel Nixon
Published: (2025)

Mirror Descent Policy Optimisation for Robust Constrained Markov Decision Processes
by: Bossens, David M., et al.
Published: (2025)

Instance Generation for Meta-Black-Box Optimization through Latent Space Reverse Engineering
by: Wang, Chen, et al.
Published: (2025)

Learning Mixtures of Experts with EM: A Mirror Descent Perspective
by: Fruytier, Quentin, et al.
Published: (2024)

Mirror Descent Actor Critic via Bounded Advantage Learning
by: Iwaki, Ryo
Published: (2025)

Adaptively Perturbed Mirror Descent for Learning in Games
by: Abe, Kenshi, et al.
Published: (2023)

Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance
by: Ding, Shutong, et al.
Published: (2026)

The Hidden Cost of Approximation in Online Mirror Descent
by: Schlisselberg, Ofir, et al.
Published: (2025)