| Main Authors: | Ma, Shaocong, Chen, Ziyi, Zhou, Yi, Huang, Heng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.17448 |
Similar Items
Robust Reinforcement Learning in Finance: Modeling Market Impact with Elliptic Uncertainty Sets
by: Ma, Shaocong, et al.
Published: (2025)
Distributionally Robust Constrained Reinforcement Learning under Strong Duality
by: Zhang, Zhengfei, et al.
Published: (2024)
Achieve Performatively Optimal Policy for Performative Reinforcement Learning
by: Chen, Ziyi, et al.
Published: (2025)
On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization
by: Ma, Shaocong, et al.
Published: (2025)
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
by: Ma, Shaocong, et al.
Published: (2025)
Riemannian Zeroth-Order Gradient Estimation with Structure-Preserving Metrics for Geodesically Incomplete Manifolds
by: Ma, Shaocong, et al.
Published: (2026)
Distributionally Robust Multi-Objective Optimization
by: Yang, Yufeng, et al.
Published: (2026)
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
by: Li, Zihao, et al.
Published: (2024)
Truncated Rectified Flow Policy for Reinforcement Learning with One-Step Sampling
by: Zhou, Xubin, et al.
Published: (2026)
End-to-End Mesh Optimization of a Hybrid Deep Learning Black-Box PDE Solver
by: Ma, Shaocong, et al.
Published: (2024)
New Hybrid Fine-Tuning Paradigm for LLMs: Algorithm Design and Convergence Analysis Framework
by: Ma, Shaocong, et al.
Published: (2026)
Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning
by: Zhang, Jing, et al.
Published: (2023)
Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
by: Peng, Xiyue, et al.
Published: (2024)
Adversarial Constrained Policy Optimization: Improving Constrained Reinforcement Learning by Adapting Budgets
by: Ma, Jianmina, et al.
Published: (2024)
Rectifying Regression in Reinforcement Learning
by: Ayoub, Alex, et al.
Published: (2025)
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
by: Hou, Hongru, et al.
Published: (2026)
Provably Efficient Algorithms for S- and Non-Rectangular Robust MDPs with General Parameterization
by: Satheesh, Anirudh, et al.
Published: (2026)
Zeroth-Order Methods for Stochastic Nonconvex Nonsmooth Composite Optimization
by: Chen, Ziyi, et al.
Published: (2025)
Incentivizing Safer Actions in Policy Optimization for Constrained Reinforcement Learning
by: Hazra, Somnath, et al.
Published: (2025)
Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization
by: Stradi, Francesco Emanuele, et al.
Published: (2024)
Trade-off in Estimating the Number of Byzantine Clients in Federated Learning
by: Chen, Ziyi, et al.
Published: (2025)
Belief-Based Offline Reinforcement Learning for Delay-Robust Policy Optimization
by: Zhan, Simon Sinong, et al.
Published: (2025)
Certifiably Robust Policies for Uncertain Parametric Environments
by: Schnitzer, Yannik, et al.
Published: (2024)
Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy
by: Zheng, Xiang, et al.
Published: (2023)
Robust Parameter Learning for Uncertain MDPs
by: Schnitzer, Yannik, et al.
Published: (2026)
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
by: Hu, Jifeng, et al.
Published: (2025)
Hypercube Policy Regularization Framework for Offline Reinforcement Learning
by: Shen, Yi, et al.
Published: (2024)
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning
by: Xu, Linjie, et al.
Published: (2023)
Reinforcement Learning-assisted Constraint Relaxation for Constrained Expensive Optimization
by: Zhu, Qianhao, et al.
Published: (2026)
Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization
by: Zhang, Qi, et al.
Published: (2024)
Action Robust Reinforcement Learning via Optimal Adversary Aware Policy Optimization
by: Nie, Buqing, et al.
Published: (2025)
Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
by: Alles, Marvin, et al.
Published: (2024)
Evaluating and Learning Robust Bandit Policies Under Uncertain Causal Mechanisms
by: Avery, Katherine, et al.
Published: (2025)
State-wise Constrained Policy Optimization
by: Zhao, Weiye, et al.
Published: (2023)
Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)
DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning
by: He, Longxiang, et al.
Published: (2023)
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
by: Liu, Weidong, et al.
Published: (2023)
Agentic Reinforced Policy Optimization
by: Dong, Guanting, et al.
Published: (2025)
PolicyFlow: Policy Optimization with Continuous Normalizing Flow in Reinforcement Learning
by: Yang, Shunpeng, et al.
Published: (2026)
Policy Improvement Reinforcement Learning
by: Wang, Huaiyang, et al.
Published: (2026)