| Main Authors: | DeWeese, Alex, Qu, Guannan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.10909 |
Similar Items
Thinking Beyond Visibility: A Near-Optimal Policy Framework for Locally Interdependent Multi-Agent MDPs
by: DeWeese, Alex, et al.
Published: (2025)
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
by: DeWeese, Alex, et al.
Published: (2024)
A Theory of Saddle Escape in Deep Nonlinear Networks
by: Rawal, Divit, et al.
Published: (2026)
Policy Gradient with Tree Search: Avoiding Local Optimas through Lookahead
by: Koren, Uri, et al.
Published: (2025)
Non-Myopic Active Feature Acquisition via Pathwise Policy Gradients
by: Aronsson, Linus, et al.
Published: (2026)
Natural Policy Gradient for Average Reward Non-Stationary RL
by: Jali, Neharika, et al.
Published: (2025)
Differentiating Policies for Non-Myopic Bayesian Optimization
by: Nwankwo, Darian, et al.
Published: (2024)
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
by: Corrado, Nicholas E., et al.
Published: (2023)
Escaping Local Optima in Global Placement
by: Xue, Ke, et al.
Published: (2024)
Group Policy Gradient
by: Chen, Junhua, et al.
Published: (2025)
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
by: Kunin, Daniel, et al.
Published: (2025)
When Do Off-Policy and On-Policy Policy Gradient Methods Align?
by: Mambelli, Davide, et al.
Published: (2024)
Learning General Policies with Policy Gradient Methods
by: Ståhlberg, Simon, et al.
Published: (2025)
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
by: Montenegro, Alessandro, et al.
Published: (2024)
Wasserstein Proximal Policy Gradient
by: Zhu, Zhaoyu, et al.
Published: (2026)
Functional Natural Policy Gradients
by: Bibaut, Aurelien, et al.
Published: (2026)
Differentially Private Policy Gradient
by: Rio, Alexandre, et al.
Published: (2025)
Delightful Policy Gradient
by: Osband, Ian
Published: (2026)
Flow Matching Policy Gradients
by: McAllister, David, et al.
Published: (2025)
A Minibatch-SGD-Based Learning Meta-Policy for Inventory Systems with Myopic Optimal Policy
by: Lyu, Jiameng, et al.
Published: (2024)
On Building Myopic MPC Policies using Supervised Learning
by: Orrico, Christopher A., et al.
Published: (2024)
Policy Gradient with Kernel Quadrature
by: Hayakawa, Satoshi, et al.
Published: (2023)
On Quantum Natural Policy Gradients
by: Sequeira, André, et al.
Published: (2024)
Policy Gradient with Tree Expansion
by: Dalal, Gal, et al.
Published: (2023)
Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
by: Karkada, Dhruva, et al.
Published: (2025)
Identifying Policy Gradient Subspaces
by: Schneider, Jan, et al.
Published: (2024)
Imitate Optimal Policy: Prevail and Induce Action Collapse in Policy Gradient
by: Zhou, Zhongzhu, et al.
Published: (2025)
Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes
by: Montenegro, Alessandro, et al.
Published: (2025)
Behind the Myth of Exploration in Policy Gradients
by: Bolland, Adrien, et al.
Published: (2024)
GFlowNet Training by Policy Gradients
by: Niu, Puhua, et al.
Published: (2024)
Policy Gradient with Active Importance Sampling
by: Papini, Matteo, et al.
Published: (2024)
Does "Do Differentiable Simulators Give Better Policy Gradients?" Give Better Policy Gradients?
by: Onoda, Ku, et al.
Published: (2026)
Delightful Distributed Policy Gradient
by: Osband, Ian
Published: (2026)
GPG: Generalized Policy Gradient Theorem for Transformer-based Policies
by: Mao, Hangyu, et al.
Published: (2025)
Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients
by: Cundy, Chris, et al.
Published: (2020)
Gradient Extrapolation-Based Policy Optimization
by: Swapnil, Ismam Nur, et al.
Published: (2026)
Partial Policy Gradients for RL in LLMs
by: Mathur, Puneet, et al.
Published: (2026)
Policy Gradient for LQR with Domain Randomization
by: Fujinami, Tesshu, et al.
Published: (2025)
Recurrent Natural Policy Gradient for POMDPs
by: Cayci, Semih, et al.
Published: (2024)
Elementary Analysis of Policy Gradient Methods
by: Liu, Jiacai, et al.
Published: (2024)