| Main Authors: | Wang, Tao, Herbert, Sylvia, Gao, Sicun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.17832 |
Similar Items
Improving Value Estimation Critically Enhances Vanilla Policy Gradient
by: Wang, Tao, et al.
Published: (2025)
Fractal Landscapes in Policy Optimization
by: Wang, Tao, et al.
Published: (2023)
Extremum-Seeking Action Selection for Accelerating Policy Optimization
by: Chang, Ya-Chien, et al.
Published: (2024)
Hamilton-Jacobi Reachability in Reinforcement Learning: A Survey
by: Ganai, Milan, et al.
Published: (2024)
Value Functions for Temporal Logic: Optimal Policies and Safety Filters
by: So, Oswin, et al.
Published: (2026)
Does "Do Differentiable Simulators Give Better Policy Gradients?" Give Better Policy Gradients?
by: Onoda, Ku, et al.
Published: (2026)
Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning
by: Shitanda, Naoki, et al.
Published: (2026)
Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods
by: Lyu, Xubo, et al.
Published: (2020)
Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow
by: Koo, Juil, et al.
Published: (2026)
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
by: Jain, Ayush, et al.
Published: (2024)
Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers
by: Vasan, Gautham, et al.
Published: (2024)
A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
by: Liu, Xinjie, et al.
Published: (2025)
When Maximum Entropy Misleads Policy Optimization
by: Zhang, Ruipeng, et al.
Published: (2025)
Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms
by: Schoepp, Sheila, et al.
Published: (2024)
ImaginationPolicy: Towards Generalizable, Precise and Reliable End-to-End Policy for Robotic Manipulation
by: Lu, Dekun, et al.
Published: (2025)
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
by: Yuan, Xiu, et al.
Published: (2024)
Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression
by: Acero, Fernando, et al.
Published: (2024)
RN-D: Discretized Categorical Actors with Regularized Networks for On-Policy Reinforcement Learning
by: Bian, Yuexin, et al.
Published: (2026)
Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects
by: Mosbach, Malte, et al.
Published: (2024)
Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle Coordination by Multi-Critic Policy Gradient Optimization
by: Alon, Yoav, et al.
Published: (2020)
Activation-Descent Regularization for Input Optimization of ReLU Networks
by: Yu, Hongzhan, et al.
Published: (2024)
A Taxonomy for Evaluating Generalist Robot Manipulation Policies
by: Gao, Jensen, et al.
Published: (2025)
Evolutionary Policy Optimization
by: Wang, Jianren, et al.
Published: (2025)
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
by: Liu, Tenglong, et al.
Published: (2024)
Imagination Policy: Using Generative Point Cloud Models for Learning Manipulation Policies
by: Huang, Haojie, et al.
Published: (2024)
Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos
by: Ye, Weirui, et al.
Published: (2025)
Coordinated Humanoid Manipulation with Choice Policies
by: Qi, Haozhi, et al.
Published: (2025)
Latent Policy Steering with Embodiment-Agnostic Pretrained World Models
by: Wang, Yiqi, et al.
Published: (2025)
Policy-Guided Diffusion
by: Jackson, Matthew Thomas, et al.
Published: (2024)
Absolute Policy Optimization
by: Zhao, Weiye, et al.
Published: (2023)
WarmPrior: Straightening Flow-Matching Policies with Temporal Priors
by: Kang, Sinjae, et al.
Published: (2026)
Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations
by: Mitsides, Konstantinos, et al.
Published: (2025)
Leveraging Analytic Gradients in Provably Safe Reinforcement Learning
by: Walter, Tim, et al.
Published: (2025)
Foundation Policies with Hilbert Representations
by: Park, Seohong, et al.
Published: (2024)
Exploiting Hybrid Policy in Reinforcement Learning for Interpretable Temporal Logic Manipulation
by: Zhang, Hao, et al.
Published: (2024)
Safe Deep Policy Adaptation
by: Xiao, Wenli, et al.
Published: (2023)
Flattening Hierarchies with Policy Bootstrapping
by: Zhou, John L., et al.
Published: (2025)
Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation
by: Xue, Han, et al.
Published: (2025)
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
by: Jiang, Yunfan, et al.
Published: (2024)
RoboPocket: Improve Robot Policies Instantly with Your Phone
by: Fang, Junjie, et al.
Published: (2026)