Library Catalog

Bibliographic Details
Main Authors:	Levine, Jacob Ede; Luo, Yun Lyan; Kosaraju, Sai Chandra
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.04521

Similar Items

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
by: Frans, Kevin, et al.
Published: (2024)

Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
by: Ma, Haozhe, et al.
Published: (2024)

Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
by: Ma, Haozhe, et al.
Published: (2024)

UniMAP: Universal SMILES-Graph Representation Learning
by: Feng, Shikun, et al.
Published: (2023)

Exploring Pass-Rate Reward in Reinforcement Learning for Code Generation
by: Li, Xin-Ye, et al.
Published: (2026)

RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning
by: Xu, Charles, et al.
Published: (2024)

Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback
by: Kim, Gihoon, et al.
Published: (2026)

BERT Learns (and Teaches) Chemistry
by: Payne, Josh, et al.
Published: (2020)

Reinforcement Learning with Action Chunking
by: Li, Qiyang, et al.
Published: (2025)

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
by: Qu, Yun, et al.
Published: (2024)

Contextual Rollout Bandits for Reinforcement Learning with Verifiable Rewards
by: Lu, Xiaodong, et al.
Published: (2026)

HLS-Seek: QoR-Aware Code Generation for High-Level Synthesis via Proxy Comparative Reward Reinforcement Learning
by: Zou, Qingyun, et al.
Published: (2026)

Reinforcement Learning with Exogenous States and Rewards
by: Trimponias, George, et al.
Published: (2023)

Reinforcement Learning with Symbolic Reward Machines
by: Krug, Thomas, et al.
Published: (2026)

Reinforcement Learning with Stochastic Reward Machines
by: Corazza, Jan, et al.
Published: (2025)

Offline Reinforcement Learning with Imputed Rewards
by: Romeo, Carlo, et al.
Published: (2024)

Hybrid Reward-Driven Reinforcement Learning for Efficient Quantum Circuit Synthesis
by: Giordano, Sara, et al.
Published: (2025)

t-SMILES: A Scalable Fragment-based Molecular Representation Framework for De Novo Molecule Generation
by: Wu, Juan-Ni, et al.
Published: (2023)

Reward Is Enough: LLMs Are In-Context Reinforcement Learners
by: Song, Kefan, et al.
Published: (2025)

GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
by: Lee, Jaewoo, et al.
Published: (2024)

PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators
by: Earle, Sam, et al.
Published: (2024)

Generalizing Behavior via Inverse Reinforcement Learning with Closed-Form Reward Centroids
by: Lazzati, Filippo, et al.
Published: (2025)

Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
by: Lee, Younghwan, et al.
Published: (2025)

Constraints as Rewards: Reinforcement Learning for Robots without Reward Functions
by: Ishihara, Yu, et al.
Published: (2025)

Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
by: Xie, Tianbao, et al.
Published: (2023)

Exploration by Random Reward Perturbation
by: Ma, Haozhe, et al.
Published: (2025)

Beyond Rewards in Reinforcement Learning for Cyber Defence
by: Bates, Elizabeth, et al.
Published: (2026)

RLSR: Reinforcement Learning from Self Reward
by: Simonds, Toby, et al.
Published: (2025)

Efficient Reinforcement Learning in Probabilistic Reward Machines
by: Lin, Xiaofeng, et al.
Published: (2024)

Which Rewards Matter? Reward Selection for Reinforcement Learning under Limited Feedback
by: Chaudhari, Shreyas, et al.
Published: (2025)

Robust Offline Reinforcement learning with Heavy-Tailed Rewards
by: Zhu, Jin, et al.
Published: (2023)

RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
by: Wu, Mian, et al.
Published: (2025)

What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning
by: Ma, Yiran, et al.
Published: (2024)

Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale
by: Roth, Amit, et al.
Published: (2026)

DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks
by: Mu, Tongzhou, et al.
Published: (2024)

MIR: Efficient Exploration in Episodic Multi-Agent Reinforcement Learning via Mutual Intrinsic Reward
by: Chen, Kesheng, et al.
Published: (2025)

A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning
by: Yoshihara, Hiroshi, et al.
Published: (2025)

Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
by: Feng, Yunhai, et al.
Published: (2025)

Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes
by: Rojas, Juan Sebastian, et al.
Published: (2024)

Adaptive Correlation-Weighted Intrinsic Rewards for Reinforcement Learning
by: Nguyen, Viet Bac, et al.
Published: (2026)