| Main Authors: | Chen, Ying-Tu, Hung, Wei, Wu, Bing-Shu, Hong, Zhang-Wei, Hsieh, Ping-Chun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.24532 |
Similar Items
Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots
by: Hung, Wei, et al.
Published: (2022)
From Reward-Free Representations to Preferences: Rethinking Offline Preference-Based Reinforcement Learning
by: Yang, Jun-Jie, et al.
Published: (2026)
Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs
by: Hung, Wei, et al.
Published: (2025)
Diffusion-Reward Adversarial Imitation Learning
by: Lai, Chun-Mao, et al.
Published: (2024)
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
by: Hung, Yu-Heng, et al.
Published: (2025)
Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee
by: Hung, Yu-Heng, et al.
Published: (2025)
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
by: Zhang, Qining, et al.
Published: (2024)
Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
by: Park, Giseung, et al.
Published: (2025)
Enhancing Offline Model-Based RL via Active Model Selection: A Bayesian Optimization Perspective
by: Yang, Yu-Wei, et al.
Published: (2025)
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization
by: Bai, Yang, et al.
Published: (2026)
Action-Constrained Imitation Learning
by: Yeh, Chia-Han, et al.
Published: (2025)
Self-Reinforced Graph Contrastive Learning
by: Hsieh, Chou-Ying, et al.
Published: (2025)
Digital Twin-enabled Multi-generation Control Co-Design with Deep Reinforcement Learning
by: Tsai, Ying-Kuan, et al.
Published: (2025)
Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms
by: Aggarwal, Vaneet, et al.
Published: (2024)
AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps
by: Hsieh, Chia-Ying, et al.
Published: (2026)
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
by: Chen, Yen-Ju, et al.
Published: (2023)
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts
by: Wang, Haoxiang, et al.
Published: (2024)
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
by: Chiang, Chia-Cheng, et al.
Published: (2024)
Beware Untrusted Simulators -- Reward-Free Backdoor Attacks in Reinforcement Learning
by: Rathbun, Ethan, et al.
Published: (2026)
Reinforcement Learning with Non-Cumulative Objective
by: Cui, Wei, et al.
Published: (2023)
DexMan: Learning Bimanual Dexterous Manipulation from Human and Generated Videos
by: Hsieh, Jhen, et al.
Published: (2025)
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
by: Zhang, Kaiyi, et al.
Published: (2026)
Pareto Set Learning for Multi-Objective Reinforcement Learning
by: Liu, Erlong, et al.
Published: (2025)
Semi-Supervised Cross-Domain Imitation Learning
by: Chu, Li-Min, et al.
Published: (2026)
DFedReweighting: A Unified Framework for Objective-Oriented Reweighting in Decentralized Federated Learning
by: Zhang, Kaichuang, et al.
Published: (2025)
Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
by: Kim, Dohyeong, et al.
Published: (2024)
Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning
by: Chen, Sze-Ann, et al.
Published: (2026)
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
by: Liu, Max, et al.
Published: (2024)
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
by: Bui, The Viet, et al.
Published: (2025)
Focal Reward: Balanced Reinforcement Learning under Rubric-Based Rewards
by: Huang, Yu, et al.
Published: (2026)
MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation
by: Rachman, Rifny, et al.
Published: (2026)
RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
by: Zhang, Zijing, et al.
Published: (2025)
PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
by: Lin, Baijiong, et al.
Published: (2025)
STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning
by: Shao, Wei, et al.
Published: (2024)
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Reward Design
by: Wei, Quan, et al.
Published: (2025)
Learning Pareto-Optimal Rewards from Noisy Preferences: A Framework for Multi-Objective Inverse Reinforcement Learning
by: Cherukuri, Kalyan, et al.
Published: (2025)
Automatic Reward Shaping from Multi-Objective Human Heuristics
by: Xie, Yuqing, et al.
Published: (2025)
Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
by: Lu, Yining, et al.
Published: (2025)
A Continuous Encoding-Based Representation for Efficient Multi-Fidelity Multi-Objective Neural Architecture Search
by: Wei, Zhao, et al.
Published: (2025)
Preference-based Multi-Objective Reinforcement Learning
by: Mu, Ni, et al.
Published: (2025)