:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Allan; Finn, Chelsea; Harrison, James
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.05232

Similar Items

TQL: Scaling Q-Functions with Transformers by Preventing Attention Collapse
by: Dong, Perry, et al.
Published: (2026)

Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
by: Hsu, Sheryl, et al.
Published: (2024)

Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
by: Xie, Johnathan, et al.
Published: (2024)

EXPO: Stable Reinforcement Learning with Expressive Policies
by: Dong, Perry, et al.
Published: (2025)

FASTER: Value-Guided Sampling for Fast RL
by: Dong, Perry, et al.
Published: (2026)

Reinforcement Learning via Implicit Imitation Guidance
by: Dong, Perry, et al.
Published: (2025)

MemER: Scaling Up Memory for Robot Control via Experience Retrieval
by: Sridhar, Ajay, et al.
Published: (2025)

Learning Long-Context Diffusion Policies via Past-Token Prediction
by: Torne, Marcel, et al.
Published: (2025)

Value Flows
by: Dong, Perry, et al.
Published: (2025)

Curating Demonstrations using Online Experience
by: Chen, Annie S., et al.
Published: (2025)

Efficient Data Collection for Robotic Manipulation via Compositional Generalization
by: Gao, Jensen, et al.
Published: (2024)

Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning
by: Wagenmaker, Andrew, et al.
Published: (2025)

Polychromic Objectives for Reinforcement Learning
by: Hamid, Jubayer Ibn, et al.
Published: (2025)

Conservative Prediction via Data-Driven Confidence Minimization
by: Choi, Caroline, et al.
Published: (2023)

Affordance-Guided Reinforcement Learning via Visual Prompting
by: Lee, Olivia Y., et al.
Published: (2024)

Clarify: Improving Model Robustness With Natural Language Corrections
by: Lee, Yoonho, et al.
Published: (2024)

Calibrating Language Models with Adaptive Temperature Scaling
by: Xie, Johnathan, et al.
Published: (2024)

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
by: Kim, Moo Jin, et al.
Published: (2025)

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
by: Putta, Pranav, et al.
Published: (2024)

Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning
by: Xiang, Violet, et al.
Published: (2025)

MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning
by: Rafailov, Rafael, et al.
Published: (2024)

Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
by: Liu, Yuejiang, et al.
Published: (2024)

Contrastive Preference Learning: Learning from Human Feedback without RL
by: Hejna, Joey, et al.
Published: (2023)

A Critical Evaluation of AI Feedback for Aligning Large Language Models
by: Sharma, Archit, et al.
Published: (2024)

Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
by: Fu, Zipeng, et al.
Published: (2024)

Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
by: Mark, Max Sobol, et al.
Published: (2024)

Direct Preference Optimization: Your Language Model is Secretly a Reward Model
by: Rafailov, Rafael, et al.
Published: (2023)

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
by: Nakamoto, Mitsuhiko, et al.
Published: (2023)

RLVF: Learning from Verbal Feedback without Overgeneralization
by: Stephan, Moritz, et al.
Published: (2024)

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems
by: Qu, Yuxiao, et al.
Published: (2025)

Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models
by: Wu, Qi, et al.
Published: (2024)

Deriving Neural Scaling Laws from the statistics of natural language
by: Cagnetta, Francesco, et al.
Published: (2026)

Target-Aligned Reinforcement Learning
by: Pleiss, Leonard S., et al.
Published: (2026)

Yell At Your Robot: Improving On-the-Fly from Language Corrections
by: Shi, Lucy Xiaoyang, et al.
Published: (2024)

World Action Verifier: Self-Improving World Models via Forward-Inverse Asymmetry
by: Liu, Yuejiang, et al.
Published: (2026)

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
by: Rafailov, Rafael, et al.
Published: (2024)

HumanPlus: Humanoid Shadowing and Imitation from Humans
by: Fu, Zipeng, et al.
Published: (2024)

Neural Green's Functions
by: Yoo, Seungwoo, et al.
Published: (2025)

Universal Value-Function Uncertainties
by: Zanger, Moritz A., et al.
Published: (2025)

Towards Universal Neural Likelihood Inference
by: Brahmavar, Shreyas Bhat, et al.
Published: (2025)