| Main authors: | Shen, Kaichen; Chen, Peng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2601.05868 |
Similar items
Performative Policy Gradient: Optimality in Performative Reinforcement Learning
by: Basu, Debabrota, et al.
Published: (2025)
Achieve Performatively Optimal Policy for Performative Reinforcement Learning
by: Chen, Ziyi, et al.
Published: (2025)
Bayesian Design Principles for Frequentist Sequential Learning
by: Xu, Yunbei, et al.
Published: (2023)
Scalable Bi-causal Optimal Transport via KL Relaxation and Policy Gradients
by: Cao, Haoyang, et al.
Published: (2026)
Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods
by: Carmona, René, et al.
Published: (2019)
Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators
by: Han, Yinbin, et al.
Published: (2023)
Benchmarking Reinforcement Learning via Stochastic Converse Optimality: Generating Systems with Known Optimal Policies
by: Ibrahim, Sinan, et al.
Published: (2026)
Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning
by: Zeng, Sihan, et al.
Published: (2024)
Deceptive Sequential Decision-Making via Regularized Policy Optimization
by: Kim, Yerin, et al.
Published: (2025)
Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2024)
Infinite-Horizon Reinforcement Learning with Multinomial Logistic Function Approximation
by: Park, Jaehyun, et al.
Published: (2024)
Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control
by: Tabas, Sadegh Sadeghi, et al.
Published: (2024)
Random Gradient-Free Optimization in Infinite Dimensional Spaces
by: Peixoto, Caio Lins, et al.
Published: (2025)
Deterministic Policy Gradient for Reinforcement Learning with Continuous Time and State
by: Cheng, Ziheng, et al.
Published: (2025)
Wasserstein Formulation of Reinforcement Learning. An Optimal Transport Perspective on Policy Optimization
by: Dus, Mathias
Published: (2026)
Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
by: Angiuli, Andrea, et al.
Published: (2023)
Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning
by: Li, Jingqi, et al.
Published: (2022)
Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence
by: Xiao, Minheng, et al.
Published: (2024)
Natural Policy Gradient as Doubly Smoothed Policy Iteration: A Bellman-Operator Framework
by: Nanda, Phalguni, et al.
Published: (2026)
Effective Dimension Aware Fractional-Order Stochastic Gradient Descent for Convex Optimization Problems
by: Partohaghighi, Mohammad, et al.
Published: (2025)
Model-Free Output Feedback Stabilization via Policy Gradient Methods
by: Zhang, Ankang, et al.
Published: (2026)
Hierarchical Reinforcement Learning Framework for Stochastic Spaceflight Campaign Design
by: Takubo, Yuji, et al.
Published: (2021)
Transfer Learning in Bayesian Optimization for Aircraft Design
by: Tfaily, Ali, et al.
Published: (2026)
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
by: Shen, Han, et al.
Published: (2024)
Asynchronous Policy Gradient Aggregation for Efficient Distributed Reinforcement Learning
by: Tyurin, Alexander, et al.
Published: (2025)
Optimal Variance-Dependent Regret Bounds for Infinite-Horizon MDPs
by: Zamir, Guy, et al.
Published: (2026)
Recurrent Natural Policy Gradient for POMDPs
by: Cayci, Semih, et al.
Published: (2024)
Elementary Analysis of Policy Gradient Methods
by: Liu, Jiacai, et al.
Published: (2024)
An Efficient On-Policy Deep Learning Framework for Stochastic Optimal Control
by: Hua, Mengjian, et al.
Published: (2024)
A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy
by: Lyu, Jiameng, et al.
Published: (2024)
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
by: Zhang, Chenyu, et al.
Published: (2024)
On Penalty-based Bilevel Gradient Descent Method
by: Shen, Han, et al.
Published: (2023)
Sobolev Gradient Ascent for Optimal Transport: Barycenter Optimization and Convergence Analysis
by: Kim, Kaheon, et al.
Published: (2025)
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
by: Feng, Jie, et al.
Published: (2024)
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
by: Ding, Dongsheng, et al.
Published: (2023)
CORL: Reinforcement Learning of MILP Policies Solved via Branch and Bound
by: Anand, Akhil S, et al.
Published: (2025)
Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning
by: Zeng, Sihan, et al.
Published: (2024)
Gradient-Variation Online Learning under Generalized Smoothness
by: Xie, Yan-Feng, et al.
Published: (2024)
Optimal Parameter Adaptation for Safety-Critical Control via Safe Barrier Bayesian Optimization
by: Wang, Shengbo, et al.
Published: (2025)
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Stochastic Approach
by: Fernando, Heshan, et al.
Published: (2022)