| Main Authors: | Mu, Ni, Hu, Hao, Hu, Xiao, Yang, Yiqin, Xu, Bo, Jia, Qing-Shan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.00388 |
Similar Items
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
by: Luan, Yao, et al.
Published: (2025)
Preference-based Multi-Objective Reinforcement Learning
by: Mu, Ni, et al.
Published: (2025)
Query-Policy Misalignment in Preference-Based Reinforcement Learning
by: Hu, Xiao, et al.
Published: (2023)
COLLIE: Guiding Skill Discovery in Semantically Coherent Latent Space
by: Luan, Yao, et al.
Published: (2026)
OPRIDE: Offline Preference-based Reinforcement Learning via In-Dataset Exploration
by: Yang, Yiqin, et al.
Published: (2026)
S-EPOA: Overcoming the Indistinguishability of Segments with Skill-Driven Preference-Based Reinforcement Learning
by: Mu, Ni, et al.
Published: (2024)
SC2Arena and StarEvolve: Benchmark and Self-Improvement Framework for LLMs in Complex Decision-Making Tasks
by: Shen, Pengbo, et al.
Published: (2025)
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
by: Yang, Yiqin, et al.
Published: (2025)
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
by: Hu, Hao, et al.
Published: (2024)
Reinforcement Learning from Diverse Human Preferences
by: Xue, Wanqi, et al.
Published: (2023)
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2024)
LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency
by: Liu, Xiao-Yin, et al.
Published: (2024)
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning
by: Zha, Yantian, et al.
Published: (2021)
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
by: Liu, Ziang, et al.
Published: (2024)
From Representation to Clusters: A Contrastive Learning Approach for Attributed Hypergraph Clustering
by: Ni, Li, et al.
Published: (2026)
MrCoM: A Meta-Regularized World-Model Generalizing Across Multi-Scenarios
by: Xiong, Xuantang, et al.
Published: (2025)
Active Query Synthesis for Preference Learning
by: Nadagouda, Namrata, et al.
Published: (2026)
Learning from Ambiguous Data with Hard Labels
by: Xie, Zeke, et al.
Published: (2025)
DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration
by: Li, Xiyun, et al.
Published: (2025)
Max-Entropy Reinforcement Learning with Flow Matching and A Case Study on LQR
by: Zhang, Yuyang, et al.
Published: (2025)
Automata Learning from Preference and Equivalence Queries
by: Hsiung, Eric, et al.
Published: (2023)
CausalGDP: Causality-Guided Diffusion Policies for Reinforcement Learning
by: Xiao, Xiaofeng, et al.
Published: (2026)
POLO: Preference-Guided Multi-Turn Reinforcement Learning for Lead Optimization
by: Wang, Ziqing, et al.
Published: (2025)
Dual-perspective Cross Contrastive Learning in Graph Transformers
by: Yao, Zelin, et al.
Published: (2024)
Preference-Guided Reinforcement Learning for Efficient Exploration
by: Wang, Guojian, et al.
Published: (2024)
Efficient Availability Attacks against Supervised and Contrastive Learning Simultaneously
by: Wang, Yihan, et al.
Published: (2024)
Group-Agent Reinforcement Learning with Heterogeneous Agents
by: Wu, Kaiyue, et al.
Published: (2025)
PCM-SAR: Physics-Driven Contrastive Mutual Learning for SAR Classification
by: Wang, Pengfei, et al.
Published: (2025)
Representation Learning Enhanced Deep Reinforcement Learning for Optimal Operation of Hydrogen-based Multi-Energy Systems
by: Pu, Zhenyu, et al.
Published: (2026)
Ambiguous Online Learning
by: Kosoy, Vanessa
Published: (2025)
DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning
by: Hu, Xuemin, et al.
Published: (2024)
On Learning for Ambiguous Chance Constrained Problems
by: Madhusudanarao, A Ch, et al.
Published: (2023)
SASA: Semantic-Aware Contrastive Learning Framework with Separated Attention for Triple Classification
by: Xiaodan, Xu, et al.
Published: (2026)
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
by: Bai, Fengshuo, et al.
Published: (2024)
Contrastive Entropy Bounds for Density and Conditional Density Decomposition
by: Hu, Bo, et al.
Published: (2025)
Path-Coupled Bellman Flows for Distributional Reinforcement Learning
by: Xu, Boyang, et al.
Published: (2026)
Episodic Novelty Through Temporal Distance
by: Jiang, Yuhua, et al.
Published: (2025)
SHAP-Guided Kernel Actor-Critic for Explainable Reinforcement Learning
by: Li, Na, et al.
Published: (2025)
Towards Robust Incremental Learning under Ambiguous Supervision
by: Wang, Rui, et al.
Published: (2025)
Latent-Space Contrastive Reinforcement Learning for Stable and Efficient LLM Reasoning
by: Shan, Lianlei, et al.
Published: (2026)