| Main Authors: | Liu, Runze, Du, Yali, Bai, Fengshuo, Lyu, Jiafei, Li, Xiu |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2306.03615 |
Similar Items
VLP: Vision-Language Preference Learning for Embodied Manipulation
by: Liu, Runze, et al.
Published: (2025)
PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning
by: Sun, Shengjie, et al.
Published: (2025)
A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning
by: Sun, Shengjie, et al.
Published: (2024)
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
by: Bai, Fengshuo, et al.
Published: (2024)
SEABO: A Simple Search-Based Method for Offline Imitation Learning
by: Lyu, Jiafei, et al.
Published: (2024)
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
by: Lyu, Jiafei, et al.
Published: (2024)
Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering
by: Qiao, Zhongjian, et al.
Published: (2025)
Temporal Difference Learning with Constrained Initial Representations
by: Lyu, Jiafei, et al.
Published: (2026)
Unifying Value Alignment and Assignment in Cross-Domain Offline Reinforcement Learning with Heterogeneous Datasets
by: Qiao, Zhongjian, et al.
Published: (2026)
Dual-Robust Cross-Domain Offline Reinforcement Learning Against Dynamics Shifts
by: Qiao, Zhongjian, et al.
Published: (2025)
Mildly Conservative Q-Learning for Offline Reinforcement Learning
by: Lyu, Jiafei, et al.
Published: (2022)
Mind the Model, Not the Agent: The Primacy Bias in Model-based RL
by: Qiao, Zhongjian, et al.
Published: (2023)
SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning
by: Qiao, Zhongjian, et al.
Published: (2024)
Understanding What Affects the Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence
by: Lyu, Jiafei, et al.
Published: (2024)
A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation
by: Gong, Aicheng, et al.
Published: (2024)
Exploration and Anti-Exploration with Distributional Random Network Distillation
by: Yang, Kai, et al.
Published: (2024)
Cooperative Open-ended Learning Framework for Zero-shot Coordination
by: Li, Yang, et al.
Published: (2023)
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
by: Liu, Zeyuan, et al.
Published: (2025)
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
by: Rajaram, Sara, et al.
Published: (2025)
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics
by: Chen, Shirui, et al.
Published: (2026)
Cross-Domain Offline Policy Adaptation via Selective Transition Correction
by: Yan, Mengbei, et al.
Published: (2026)
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
by: Zhang, Zhaowei, et al.
Published: (2025)
PEARL: Performance-Enhanced Aggregated Representation Learning
by: Li, Wenhui, et al.
Published: (2025)
GEM: Generative Entropy-Guided Preference Modeling for Few-shot Alignment of LLMs
by: Zhao, Yiyang, et al.
Published: (2025)
ODRL: A Benchmark for Off-Dynamics Reinforcement Learning
by: Lyu, Jiafei, et al.
Published: (2024)
Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting
by: Qiao, Zhongjian, et al.
Published: (2026)
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
by: Yang, Kai, et al.
Published: (2023)
PEARL: Training Socratic Tutors with Pedagogically Aligned Reinforcement Learning
by: Chang, Qikai, et al.
Published: (2026)
Exploration by Random Distribution Distillation
by: Fang, Zhirui, et al.
Published: (2025)
ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs
by: Li, Yuhan, et al.
Published: (2024)
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
by: Bai, Fengshuo, et al.
Published: (2024)
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation
by: Pumacay, Wilbert, et al.
Published: (2024)
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
by: Zhang, Shenao, et al.
Published: (2024)
Adversarial Preference Learning for Robust LLM Alignment
by: Wang, Yuanfu, et al.
Published: (2025)
ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation
by: Tang, Nan, et al.
Published: (2025)
Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning
by: Chen, Yangkun, et al.
Published: (2024)
Reward Machine Inference for Robotic Manipulation
by: Baert, Mattijs, et al.
Published: (2024)
SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
by: Lou, Xingzhou, et al.
Published: (2024)
ProcVLM: Learning Procedure-Grounded Progress Rewards for Robotic Manipulation
by: Feng, Youhe, et al.
Published: (2026)
PEARL: Towards Permutation-Resilient LLMs
by: Chen, Liang, et al.
Published: (2025)