| Main Authors: | Futuhi, Ehsan, Karimi, Shayan, Gao, Chao, Müller, Martin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.05225 |
Similar Items
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
by: Chenevert, Gabriel, et al.
Published: (2024)
Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation
by: Patel, Bhrij, et al.
Published: (2023)
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
by: Jain, Ayush, et al.
Published: (2024)
Learning Admissible Heuristics for A*: Theory and Practice
by: Futuhi, Ehsan, et al.
Published: (2025)
Neural Style Transfer with Twin-Delayed DDPG for Shared Control of Robotic Manipulators
by: Fernandez-Fernandez, Raul, et al.
Published: (2024)
Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies
by: Christmann, Guilherme, et al.
Published: (2024)
Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning
by: Vasan, Gautham, et al.
Published: (2024)
Optimisation of Structured Neural Controller Based on Continuous-Time Policy Gradient
by: Cho, Namhoon, et al.
Published: (2022)
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
by: Wang, Yixiao, et al.
Published: (2024)
Latent-Conditioned Policy Gradient for Multi-Objective Deep Reinforcement Learning
by: Kanazawa, Takuya, et al.
Published: (2023)
Mollification Effects of Policy Gradient Methods
by: Wang, Tao, et al.
Published: (2024)
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
by: Liu, Jijia, et al.
Published: (2025)
Flow Matching Policy Gradients
by: McAllister, David, et al.
Published: (2025)
Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery
by: Karimi, Zohre, et al.
Published: (2024)
Confounding Robust Continuous Control via Automatic Reward Shaping
by: Juliani, Mateo, et al.
Published: (2026)
Adaptive Teaching in Heterogeneous Agents: Balancing Surprise in Sparse Reward Scenarios
by: Clark, Emma, et al.
Published: (2024)
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments
by: Hossain, Jumman, et al.
Published: (2024)
A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
by: Liu, Xinjie, et al.
Published: (2025)
Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms
by: Schoepp, Sheila, et al.
Published: (2024)
Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
by: Yunis, David, et al.
Published: (2023)
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
by: Diaz-Bone, Leander, et al.
Published: (2025)
Generalized Advantage Estimation for Distributional Policy Gradients
by: Shaik, Shahil, et al.
Published: (2025)
Scaling Algorithm Distillation for Continuous Control with Mamba
by: Beaussant, Samuel, et al.
Published: (2025)
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
by: Zhang, Chen Bo Calvin, et al.
Published: (2024)
What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study
by: Chen, Jiayu, et al.
Published: (2024)
Enabling Option Learning in Sparse Rewards with Hindsight Experience Replay
by: Romio, Gabriel, et al.
Published: (2026)
Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution
by: Huang, Changxin, et al.
Published: (2024)
Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping
by: Miranda, Victor R. F., et al.
Published: (2022)
A Review of Online Diffusion Policy RL Algorithms for Scalable Robotic Control
by: Choi, Wonhyeok, et al.
Published: (2026)
Does "Do Differentiable Simulators Give Better Policy Gradients?" Give Better Policy Gradients?
by: Onoda, Ku, et al.
Published: (2026)
Policy Learning from Large Vision-Language Model Feedback without Reward Modeling
by: Luu, Tung M., et al.
Published: (2025)
Robot Policy Learning with Temporal Optimal Transport Reward
by: Fu, Yuwei, et al.
Published: (2024)
PolicyFlow: Policy Optimization with Continuous Normalizing Flow in Reinforcement Learning
by: Yang, Shunpeng, et al.
Published: (2026)
ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation
by: Tang, Nan, et al.
Published: (2025)
CaRL: Learning Scalable Planning Policies with Simple Rewards
by: Jaeger, Bernhard, et al.
Published: (2025)
Momentum Based Reward Design for Low Emission Traffic Signal Control
by: Mundane, Chinmay, et al.
Published: (2026)
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
by: Lei, Kun, et al.
Published: (2023)
Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods
by: Lyu, Xubo, et al.
Published: (2020)
Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
by: Zhou, Zehao
Published: (2024)
Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow
by: Koo, Juil, et al.
Published: (2026)