| Main Authors: | Sukhija, Bhavya, Treven, Lenart, Sferrazza, Carmelo, Dörfler, Florian, Abbeel, Pieter, Krause, Andreas |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.20066 |
Similar Items
Sample-efficient and Scalable Exploration in Continuous-Time RL
by: Iten, Klemens, et al.
Published: (2025)
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
by: Treven, Lenart, et al.
Published: (2024)
NeoRL: Efficient Exploration for Nonepisodic RL
by: Sukhija, Bhavya, et al.
Published: (2024)
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
by: As, Yarden, et al.
Published: (2024)
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
by: Sukhija, Bhavya, et al.
Published: (2024)
Bridging the Sim-to-Real Gap with Bayesian Inference
by: Rothfuss, Jonas, et al.
Published: (2024)
Simulation Priors for Data-Efficient Deep Learning
by: Treven, Lenart, et al.
Published: (2025)
Active Few-Shot Fine-Tuning
by: Hübotter, Jonas, et al.
Published: (2024)
Transductive Active Learning: Theory and Applications
by: Hübotter, Jonas, et al.
Published: (2024)
Model-Based Reinforcement Learning for Control under Time-Varying Dynamics
by: Iten, Klemens, et al.
Published: (2026)
TARC: Time-Adaptive Robotic Control
by: Sukhija, Arnav, et al.
Published: (2025)
Optimistic Online LQR via Intrinsic Rewards
by: Bartos, Marcell, et al.
Published: (2026)
Safe Exploration Using Bayesian World Models and Log-Barrier Optimization
by: As, Yarden, et al.
Published: (2024)
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
by: Nauman, Michal, et al.
Published: (2025)
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation
by: Sferrazza, Carmelo, et al.
Published: (2024)
Body Transformer: Leveraging Robot Embodiment for Policy Learning
by: Sferrazza, Carmelo, et al.
Published: (2024)
Learning Sim-to-Real Humanoid Locomotion in 15 Minutes
by: Seo, Younggyo, et al.
Published: (2025)
FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control
by: Seo, Younggyo, et al.
Published: (2025)
Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning
by: Bhardwaj, Arjun, et al.
Published: (2023)
Optimistic Task Inference for Behavior Foundation Models
by: Rupf, Thomas, et al.
Published: (2025)
OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction
by: Yang, Lujie, et al.
Published: (2025)
Value-Based Deep RL Scales Predictably
by: Rybkin, Oleh, et al.
Published: (2025)
Compute-Optimal Scaling for Value-Based Deep RL
by: Fu, Preston, et al.
Published: (2025)
Cliqueformer: Model-Based Optimization with Structured Transformers
by: Kuba, Jakub Grudzien, et al.
Published: (2024)
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
by: Wang, Xiyao, et al.
Published: (2023)
Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
by: Wu, Zhen, et al.
Published: (2026)
Residual Off-Policy RL for Finetuning Behavior Cloning Policies
by: Ankile, Lars, et al.
Published: (2025)
End-to-end RL Improves Dexterous Grasping Policies
by: Singh, Ritvik, et al.
Published: (2025)
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
by: Huang, Jiawei, et al.
Published: (2024)
Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning
by: Seo, Younggyo, et al.
Published: (2024)
DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing
by: Lee, Vint, et al.
Published: (2023)
A Stable Whitening Optimizer for Efficient Neural Network Training
by: Frans, Kevin, et al.
Published: (2025)
What Really Matters in Matrix-Whitening Optimizers?
by: Frans, Kevin, et al.
Published: (2025)
SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
by: Lee, Jongmin, et al.
Published: (2025)
Reward-Conditioned Reinforcement Learning
by: Nauman, Michal, et al.
Published: (2026)
Offline Imitation Learning Through Graph Search and Retrieval
by: Yin, Zhao-Heng, et al.
Published: (2024)
World Model on Million-Length Video And Language With Blockwise RingAttention
by: Liu, Hao, et al.
Published: (2024)
Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
by: Bal, Melis Ilayda, et al.
Published: (2024)
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
by: Psenka, Michael, et al.
Published: (2023)
Scalable Diffusion for Materials Generation
by: Yang, Sherry, et al.
Published: (2023)