| Main Authors: | Levine, Jacob Ede; Luo, Yun Lyan; Kosaraju, Sai Chandra |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.04521 |
Similar Items
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
by: Frans, Kevin, et al.
Published: (2024)
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
by: Ma, Haozhe, et al.
Published: (2024)
Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
by: Ma, Haozhe, et al.
Published: (2024)
UniMAP: Universal SMILES-Graph Representation Learning
by: Feng, Shikun, et al.
Published: (2023)
Exploring Pass-Rate Reward in Reinforcement Learning for Code Generation
by: Li, Xin-Ye, et al.
Published: (2026)
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning
by: Xu, Charles, et al.
Published: (2024)
Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback
by: Kim, Gihoon, et al.
Published: (2026)
BERT Learns (and Teaches) Chemistry
by: Payne, Josh, et al.
Published: (2020)
Reinforcement Learning with Action Chunking
by: Li, Qiyang, et al.
Published: (2025)
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
by: Qu, Yun, et al.
Published: (2024)
Contextual Rollout Bandits for Reinforcement Learning with Verifiable Rewards
by: Lu, Xiaodong, et al.
Published: (2026)
HLS-Seek: QoR-Aware Code Generation for High-Level Synthesis via Proxy Comparative Reward Reinforcement Learning
by: Zou, Qingyun, et al.
Published: (2026)
Reinforcement Learning with Exogenous States and Rewards
by: Trimponias, George, et al.
Published: (2023)
Reinforcement Learning with Symbolic Reward Machines
by: Krug, Thomas, et al.
Published: (2026)
Reinforcement Learning with Stochastic Reward Machines
by: Corazza, Jan, et al.
Published: (2025)
Offline Reinforcement Learning with Imputed Rewards
by: Romeo, Carlo, et al.
Published: (2024)
Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis
by: Giordano, Sara, et al.
Published: (2025)
t-SMILES: A Scalable Fragment-based Molecular Representation Framework for De Novo Molecule Generation
by: Wu, Juan-Ni, et al.
Published: (2023)
Reward Is Enough: LLMs Are In-Context Reinforcement Learners
by: Song, Kefan, et al.
Published: (2025)
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
by: Lee, Jaewoo, et al.
Published: (2024)
PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators
by: Earle, Sam, et al.
Published: (2024)
Generalizing Behavior via Inverse Reinforcement Learning with Closed-Form Reward Centroids
by: Lazzati, Filippo, et al.
Published: (2025)
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
by: Lee, Younghwan, et al.
Published: (2025)
Constraints as Rewards: Reinforcement Learning for Robots without Reward Functions
by: Ishihara, Yu, et al.
Published: (2025)
Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
by: Xie, Tianbao, et al.
Published: (2023)
Exploration by Random Reward Perturbation
by: Ma, Haozhe, et al.
Published: (2025)
Beyond Rewards in Reinforcement Learning for Cyber Defence
by: Bates, Elizabeth, et al.
Published: (2026)
RLSR: Reinforcement Learning from Self Reward
by: Simonds, Toby, et al.
Published: (2025)
Efficient Reinforcement Learning in Probabilistic Reward Machines
by: Lin, Xiaofeng, et al.
Published: (2024)
Which Rewards Matter? Reward Selection for Reinforcement Learning under Limited Feedback
by: Chaudhari, Shreyas, et al.
Published: (2025)
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
by: Zhu, Jin, et al.
Published: (2023)
RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
by: Wu, Mian, et al.
Published: (2025)
What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning
by: Ma, Yiran, et al.
Published: (2024)
Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale
by: Roth, Amit, et al.
Published: (2026)
DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks
by: Mu, Tongzhou, et al.
Published: (2024)
MIR: Efficient Exploration in Episodic Multi-Agent Reinforcement Learning via Mutual Intrinsic Reward
by: Chen, Kesheng, et al.
Published: (2025)
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning
by: Yoshihara, Hiroshi, et al.
Published: (2025)
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
by: Feng, Yunhai, et al.
Published: (2025)
Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes
by: Rojas, Juan Sebastian, et al.
Published: (2024)
Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning
by: Nguyen, Viet Bac, et al.
Published: (2026)