| Main Authors: | Subramani, Rohan, Williams, Marcus, Heitmann, Max, Holm, Halfdan, Griffin, Charlie, Skalse, Joar |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.11840 |
Similar Items
Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification
by: Skalse, Joar, et al.
Published: (2024)
Partial Identifiability and Misspecification in Inverse Reinforcement Learning
by: Skalse, Joar, et al.
Published: (2024)
On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks
by: Skalse, Joar, et al.
Published: (2024)
Partial Identifiability in Inverse Reinforcement Learning For Agents With Non-Exponential Discounting
by: Skalse, Joar, et al.
Published: (2024)
Defining and Characterizing Reward Hacking
by: Skalse, Joar, et al.
Published: (2022)
The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret
by: Fluri, Lukas, et al.
Published: (2024)
STARC: A General Framework For Quantifying Differences Between Reward Functions
by: Skalse, Joar, et al.
Published: (2023)
Password-Activated Shutdown Protocols for Misaligned Frontier Agents
by: Williams, Kai, et al.
Published: (2025)
Automating Formal Verification with Reinforcement Learning and Recursive Inference
by: Tan, Max
Published: (2026)
Multi-objective Reinforcement learning from AI Feedback
by: Williams, Marcus
Published: (2024)
Scalable Multi-Objective Robot Reinforcement Learning through Gradient Conflict Resolution
by: Munn, Humphrey, et al.
Published: (2025)
EXPO: Stable Reinforcement Learning with Expressive Policies
by: Dong, Perry, et al.
Published: (2025)
Balancing Expressivity and Robustness: Constrained Rational Activations for Reinforcement Learning
by: Surdej, Rafał, et al.
Published: (2025)
Towards Formalizing Reinforcement Learning Theory
by: Zhang, Shangtong
Published: (2025)
Polychromic Objectives for Reinforcement Learning
by: Hamid, Jubayer Ibn, et al.
Published: (2025)
Pareto Set Learning for Multi-Objective Reinforcement Learning
by: Liu, Erlong, et al.
Published: (2025)
Expressive Temporal Specifications for Reward Monitoring
by: Adalat, Omar, et al.
Published: (2025)
Games for AI Control: Models of Safety Evaluations of AI Deployment Protocols
by: Griffin, Charlie, et al.
Published: (2024)
Preference-based Multi-Objective Reinforcement Learning
by: Mu, Ni, et al.
Published: (2025)
Expressive Value Learning for Scalable Offline Reinforcement Learning
by: Espinosa-Dice, Nicolas, et al.
Published: (2025)
Towards Provable Emergence of In-Context Reinforcement Learning
by: Wang, Jiuqi, et al.
Published: (2025)
Optimistic Reinforcement Learning with Quantile Objectives
by: Alipour-Vaezi, Mohammad, et al.
Published: (2025)
Objective-Specific Privileged Bases via Full-Prefix Matryoshka Learning
by: Talukder, Arghamitra, et al.
Published: (2026)
On Generalization Across Environments In Multi-Objective Reinforcement Learning
by: Teoh, Jayden, et al.
Published: (2025)
Q-Flow: Stable and Expressive Reinforcement Learning with Flow-Based Policy
by: Doo, JaeHyeok, et al.
Published: (2026)
Active Learning of Molecular Data for Task-Specific Objectives
by: Ghosh, Kunal, et al.
Published: (2024)
Reinforcement Learning with $ω$-Regular Objectives and Constraints
by: Wagner, Dominik, et al.
Published: (2025)
Multi-Objective Reinforcement Learning for Water Management
by: Osika, Zuzanna, et al.
Published: (2025)
Demonstration Guided Multi-Objective Reinforcement Learning
by: Lu, Junlin, et al.
Published: (2024)
The Formalism-Implementation Gap in Reinforcement Learning Research
by: Castro, Pablo Samuel
Published: (2025)
Emergence of Chemotactic Strategies with Multi-Agent Reinforcement Learning
by: Tovey, Samuel, et al.
Published: (2024)
A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning
by: Chen, Ying-Tu, et al.
Published: (2026)
Theoretical Study of Conflict-Avoidant Multi-Objective Reinforcement Learning
by: Wang, Yudan, et al.
Published: (2024)
Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
by: Park, Giseung, et al.
Published: (2025)
Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning
by: Ganesh, Swetha, et al.
Published: (2026)
Constrained Multi-Objective Reinforcement Learning with Max-Min Criterion
by: Park, Giseung, et al.
Published: (2026)
Benchmarking Offline Multi-Objective Reinforcement Learning in Critical Care
by: Bansal, Aryaman, et al.
Published: (2025)
Multi-Objective Reinforcement Learning for Generating Covalent Inhibitor Candidates
by: Gil, Renee
Published: (2026)
Convergence and Emergence of In-Context Reinforcement Learning with Chain of Thought
by: Xie, Zixuan, et al.
Published: (2026)
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?
by: Mallen, Alex, et al.
Published: (2024)