:: Library Catalog

Bibliographic Details
Main Authors:	Lee, Yoonho; Boen, Joseph; Finn, Chelsea
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2511.07919

Similar Items

Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning
by: Xie, Johnathan, et al.
Published: (2024)

Conservative Prediction via Data-Driven Confidence Minimization
by: Choi, Caroline, et al.
Published: (2023)

Clarify: Improving Model Robustness With Natural Language Corrections
by: Lee, Yoonho, et al.
Published: (2024)

Calibrating Language Models with Adaptive Temperature Scaling
by: Xie, Johnathan, et al.
Published: (2024)

Test-Time Alignment via Hypothesis Reweighting
by: Lee, Yoonho, et al.
Published: (2024)

Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
by: Liu, Yuejiang, et al.
Published: (2024)

AutoFT: Learning an Objective for Robust Fine-Tuning
by: Choi, Caroline, et al.
Published: (2024)

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems
by: Qu, Yuxiao, et al.
Published: (2025)

TextGrad: Automatic "Differentiation" via Text
by: Yuksekgonul, Mert, et al.
Published: (2024)

Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models
by: Shi, Lucy Xiaoyang, et al.
Published: (2025)

Continuum-armed Bandit Optimization with Batch Pairwise Comparison Oracles
by: Chang, Xiangyu, et al.
Published: (2025)

Open-Ended Task Discovery via Bayesian Optimization
by: Adachi, Masaki, et al.
Published: (2026)

Pairwise Comparisons without Stochastic Transitivity: Model, Theory and Applications
by: Lee, Sze Ming, et al.
Published: (2025)

Disentangling Length from Quality in Direct Preference Optimization
by: Park, Ryan, et al.
Published: (2024)

Quantum Natural Stochastic Pairwise Coordinate Descent
by: Sohail, Mohammad Aamir, et al.
Published: (2024)

Decoding Decoded: Understanding Hyperparameter Effects in Open-Ended Text Generation
by: Arias, Esteban Garces, et al.
Published: (2024)

AutoLibra: Agent Metric Induction from Open-Ended Human Feedback
by: Zhu, Hao, et al.
Published: (2025)

Minimum Weighted Feedback Arc Sets for Ranking from Pairwise Comparisons
by: Vahidi, Soroush, et al.
Published: (2024)

Stability-based Generalization Analysis of Randomized Coordinate Descent for Pairwise Learning
by: Wu, Liang, et al.
Published: (2025)

Robust Agents in Open-Ended Worlds
by: Samvelyan, Mikayel
Published: (2025)

Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
by: Arias, Esteban Garces, et al.
Published: (2024)

A Critical Evaluation of AI Feedback for Aligning Large Language Models
by: Sharma, Archit, et al.
Published: (2024)

Affordance-Guided Reinforcement Learning via Visual Prompting
by: Lee, Olivia Y., et al.
Published: (2024)

Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses
by: Baral, Sami, et al.
Published: (2024)

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
by: Kim, Moo Jin, et al.
Published: (2025)

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons
by: Zhu, Banghua, et al.
Published: (2023)

Limited Memory Online Gradient Descent for Kernelized Pairwise Learning with Dynamic Averaging
by: AlQuabeh, Hilal, et al.
Published: (2024)

Features as Rewards: Scalable Supervision for Open-Ended Tasks via Interpretability
by: Prasad, Aaditya Vikram, et al.
Published: (2026)

Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
by: Arias, Esteban Garces, et al.
Published: (2024)

Learning from Similarity/Dissimilarity and Pairwise Comparison
by: Tate, Tomoya, et al.
Published: (2026)

FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale
by: He, Runyuan, et al.
Published: (2026)

Learning Long-Context Diffusion Policies via Past-Token Prediction
by: Torne, Marcel, et al.
Published: (2025)

Reinforcement Learning via Implicit Imitation Guidance
by: Dong, Perry, et al.
Published: (2025)

RLVF: Learning from Verbal Feedback without Overgeneralization
by: Stephan, Moritz, et al.
Published: (2024)

Universal Neural Functionals
by: Zhou, Allan, et al.
Published: (2024)

MemER: Scaling Up Memory for Robot Control via Experience Retrieval
by: Sridhar, Ajay, et al.
Published: (2025)

Contrastive Preference Learning: Learning from Human Feedback without RL
by: Hejna, Joey, et al.
Published: (2023)

Score-Based Density Estimation from Pairwise Comparisons
by: Mikkola, Petrus, et al.
Published: (2025)

Metric Learning from Limited Pairwise Preference Comparisons
by: Wang, Zhi, et al.
Published: (2024)

Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
by: Lee, Joongkyu, et al.
Published: (2025)