:: Library Catalog

Bibliographic Details
Main Authors:	Futuhi, Ehsan; Karimi, Shayan; Gao, Chao; Müller, Martin
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Robotics
Online Access:	https://arxiv.org/abs/2410.05225

Similar Items

Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
by: Chenevert, Gabriel, et al.
Published: (2024)

Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation
by: Patel, Bhrij, et al.
Published: (2023)

Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
by: Jain, Ayush, et al.
Published: (2024)

Learning Admissible Heuristics for A*: Theory and Practice
by: Futuhi, Ehsan, et al.
Published: (2025)

Neural Style Transfer with Twin-Delayed DDPG for Shared Control of Robotic Manipulators
by: Fernandez-Fernandez, Raul, et al.
Published: (2024)

Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies
by: Christmann, Guilherme, et al.
Published: (2024)

Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning
by: Vasan, Gautham, et al.
Published: (2024)

Optimisation of Structured Neural Controller Based on Continuous-Time Policy Gradient
by: Cho, Namhoon, et al.
Published: (2022)

Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
by: Wang, Yixiao, et al.
Published: (2024)

Latent-Conditioned Policy Gradient for Multi-Objective Deep Reinforcement Learning
by: Kanazawa, Takuya, et al.
Published: (2023)

Mollification Effects of Policy Gradient Methods
by: Wang, Tao, et al.
Published: (2024)

Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
by: Liu, Jijia, et al.
Published: (2025)

Flow Matching Policy Gradients
by: McAllister, David, et al.
Published: (2025)

Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery
by: Karimi, Zohre, et al.
Published: (2024)

Confounding Robust Continuous Control via Automatic Reward Shaping
by: Juliani, Mateo, et al.
Published: (2026)

Adaptive Teaching in Heterogeneous Agents: Balancing Surprise in Sparse Reward Scenarios
by: Clark, Emma, et al.
Published: (2024)

TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments
by: Hossain, Jumman, et al.
Published: (2024)

A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
by: Liu, Xinjie, et al.
Published: (2025)

Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms
by: Schoepp, Sheila, et al.
Published: (2024)

Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
by: Yunis, David, et al.
Published: (2023)

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
by: Diaz-Bone, Leander, et al.
Published: (2025)

Generalized Advantage Estimation for Distributional Policy Gradients
by: Shaik, Shahil, et al.
Published: (2025)

Scaling Algorithm Distillation for Continuous Control with Mamba
by: Beaussant, Samuel, et al.
Published: (2025)

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
by: Zhang, Chen Bo Calvin, et al.
Published: (2024)

What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study
by: Chen, Jiayu, et al.
Published: (2024)

Enabling Option Learning in Sparse Rewards with Hindsight Experience Replay
by: Romio, Gabriel, et al.
Published: (2026)

Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution
by: Huang, Changxin, et al.
Published: (2024)

Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping
by: Miranda, Victor R. F., et al.
Published: (2022)

A Review of Online Diffusion Policy RL Algorithms for Scalable Robotic Control
by: Choi, Wonhyeok, et al.
Published: (2026)

Does "Do Differentiable Simulators Give Better Policy Gradients?" Give Better Policy Gradients?
by: Onoda, Ku, et al.
Published: (2026)

Policy Learning from Large Vision-Language Model Feedback without Reward Modeling
by: Luu, Tung M., et al.
Published: (2025)

Robot Policy Learning with Temporal Optimal Transport Reward
by: Fu, Yuwei, et al.
Published: (2024)

PolicyFlow: Policy Optimization with Continuous Normalizing Flow in Reinforcement Learning
by: Yang, Shunpeng, et al.
Published: (2026)

ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation
by: Tang, Nan, et al.
Published: (2025)

CaRL: Learning Scalable Planning Policies with Simple Rewards
by: Jaeger, Bernhard, et al.
Published: (2025)

Momentum Based Reward Design for Low Emission Traffic Signal Control
by: Mundane, Chinmay, et al.
Published: (2026)

Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
by: Lei, Kun, et al.
Published: (2023)

Optimal Control-Based Baseline for Guided Exploration in Policy Gradient Methods
by: Lyu, Xubo, et al.
Published: (2020)

Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
by: Zhou, Zehao
Published: (2024)

Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow
by: Koo, Juil, et al.
Published: (2026)