:: Library Catalog

Bibliographic Details
Main Authors:	Sukhija, Bhavya; Treven, Lenart; Sferrazza, Carmelo; Dörfler, Florian; Abbeel, Pieter; Krause, Andreas
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2511.20066

Similar Items

Sample-efficient and Scalable Exploration in Continuous-Time RL
by: Iten, Klemens, et al.
Published: (2025)

When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
by: Treven, Lenart, et al.
Published: (2024)

NeoRL: Efficient Exploration for Nonepisodic RL
by: Sukhija, Bhavya, et al.
Published: (2024)

ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
by: As, Yarden, et al.
Published: (2024)

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
by: Sukhija, Bhavya, et al.
Published: (2024)

Bridging the Sim-to-Real Gap with Bayesian Inference
by: Rothfuss, Jonas, et al.
Published: (2024)

Simulation Priors for Data-Efficient Deep Learning
by: Treven, Lenart, et al.
Published: (2025)

Active Few-Shot Fine-Tuning
by: Hübotter, Jonas, et al.
Published: (2024)

Transductive Active Learning: Theory and Applications
by: Hübotter, Jonas, et al.
Published: (2024)

Model-Based Reinforcement Learning for Control under Time-Varying Dynamics
by: Iten, Klemens, et al.
Published: (2026)

TARC: Time-Adaptive Robotic Control
by: Sukhija, Arnav, et al.
Published: (2025)

Optimistic Online LQR via Intrinsic Rewards
by: Bartos, Marcell, et al.
Published: (2026)

Safe Exploration Using Bayesian World Models and Log-Barrier Optimization
by: As, Yarden, et al.
Published: (2024)

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
by: Nauman, Michal, et al.
Published: (2025)

HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation
by: Sferrazza, Carmelo, et al.
Published: (2024)

Body Transformer: Leveraging Robot Embodiment for Policy Learning
by: Sferrazza, Carmelo, et al.
Published: (2024)

Learning Sim-to-Real Humanoid Locomotion in 15 Minutes
by: Seo, Younggyo, et al.
Published: (2025)

FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control
by: Seo, Younggyo, et al.
Published: (2025)

Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning
by: Bhardwaj, Arjun, et al.
Published: (2023)

Optimistic Task Inference for Behavior Foundation Models
by: Rupf, Thomas, et al.
Published: (2025)

OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction
by: Yang, Lujie, et al.
Published: (2025)

Value-Based Deep RL Scales Predictably
by: Rybkin, Oleh, et al.
Published: (2025)

Compute-Optimal Scaling for Value-Based Deep RL
by: Fu, Preston, et al.
Published: (2025)

Cliqueformer: Model-Based Optimization with Structured Transformers
by: Kuba, Jakub Grudzien, et al.
Published: (2024)

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
by: Wang, Xiyao, et al.
Published: (2023)

Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
by: Wu, Zhen, et al.
Published: (2026)

Residual Off-Policy RL for Finetuning Behavior Cloning Policies
by: Ankile, Lars, et al.
Published: (2025)

End-to-end RL Improves Dexterous Grasping Policies
by: Singh, Ritvik, et al.
Published: (2025)

Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
by: Huang, Jiawei, et al.
Published: (2024)

Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning
by: Seo, Younggyo, et al.
Published: (2024)

DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing
by: Lee, Vint, et al.
Published: (2023)

A Stable Whitening Optimizer for Efficient Neural Network Training
by: Frans, Kevin, et al.
Published: (2025)

What Really Matters in Matrix-Whitening Optimizers?
by: Frans, Kevin, et al.
Published: (2025)

SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
by: Lee, Jongmin, et al.
Published: (2025)

Reward-Conditioned Reinforcement Learning
by: Nauman, Michal, et al.
Published: (2026)

Offline Imitation Learning Through Graph Search and Retrieval
by: Yin, Zhao-Heng, et al.
Published: (2024)

World Model on Million-Length Video And Language With Blockwise RingAttention
by: Liu, Hao, et al.
Published: (2024)

Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
by: Bal, Melis Ilayda, et al.
Published: (2024)

Learning a Diffusion Model Policy from Rewards via Q-Score Matching
by: Psenka, Michael, et al.
Published: (2023)

Scalable Diffusion for Materials Generation
by: Yang, Sherry, et al.
Published: (2023)