| Main Authors: | Landers, Matthew, Killian, Taylor W., Hartvigsen, Thomas, Doryab, Afsaneh |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.04441 |
Similar Items
SAINT: Attention-Based Policies for Discrete Combinatorial Action Spaces
by: Landers, Matthew, et al.
Published: (2025)
BraVE: Offline Reinforcement Learning for Discrete Combinatorial Action Spaces
by: Landers, Matthew, et al.
Published: (2024)
Coordination Matters: Evaluation of Cooperative Multi-Agent Reinforcement Learning
by: Cardei, Maria Ana, et al.
Published: (2026)
Factorized Deep Q-Network for Cooperative Multi-Agent Reinforcement Learning in Victim Tagging
by: Cardei, Maria Ana, et al.
Published: (2025)
Action-Free Offline-to-Online RL via Discretised State Policies
by: Neggatu, Natinael Solomon, et al.
Published: (2026)
Improving Offline RL by Blending Heuristics
by: Geng, Sinong, et al.
Published: (2023)
Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies
by: Zhu, Lingwei, et al.
Published: (2025)
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces
by: Akkerman, Fabian, et al.
Published: (2023)
Scalable Offline Model-Based RL with Action Chunks
by: Park, Kwanyoung, et al.
Published: (2025)
Flow Matching for Offline Reinforcement Learning with Discrete Actions
by: Khan, Fairoz Nower, et al.
Published: (2026)
Constrained Discrete Diffusion
by: Cardei, Michael, et al.
Published: (2025)
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
by: Mark, Max Sobol, et al.
Published: (2024)
Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL
by: Zurek, Matthew, et al.
Published: (2025)
Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners
by: Muslimani, Calarina, et al.
Published: (2025)
HIQL: Offline Goal-Conditioned RL with Latent States as Actions
by: Park, Seohong, et al.
Published: (2023)
Inference Time Policy Optimization for Offline RL with Differentiable World Models
by: Deb, Rohan, et al.
Published: (2026)
An Investigation of Offline Reinforcement Learning in Factorisable Action Spaces
by: Beeson, Alex, et al.
Published: (2024)
DEAS: DEtached value learning with Action Sequence for Scalable Offline RL
by: Kim, Changyeon, et al.
Published: (2025)
GEM: Guided Expectation-Maximization for Behavior-Normalized Candidate Action Selection in Offline RL
by: Wang, Haoyu, et al.
Published: (2026)
Modular Diffusion Policy Training: Decoupling and Recombining Guidance and Diffusion for Offline RL
by: Chen, Zhaoyang, et al.
Published: (2025)
Offline RL for Adaptive Policy Retrieval in Prior Authorization
by: Sharifullin, Ruslan, et al.
Published: (2026)
Actor-Accelerated Policy Dual Averaging for Reinforcement Learning in Continuous Action Spaces
by: Gao, Ji, et al.
Published: (2026)
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
by: Hu, Jifeng, et al.
Published: (2024)
Discretizing Continuous Action Space with Unimodal Probability Distributions for On-Policy Reinforcement Learning
by: Zhu, Yuanyang, et al.
Published: (2024)
Stochastic Q-learning for Large Discrete Action Spaces
by: Fourati, Fares, et al.
Published: (2024)
A Hardware-Aware, Per-Layer Methodology for Post-Training Quantization of Large Language Models
by: Killian, Earl
Published: (2026)
Dataset Clustering for Improved Offline Policy Learning
by: Wang, Qiang, et al.
Published: (2024)
Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC
by: Soni, Aditya, et al.
Published: (2024)
An Empirical Risk Minimization Approach for Offline Inverse RL and Dynamic Discrete Choice Model
by: Kang, Enoch H., et al.
Published: (2025)
Rewarded Region Replay (R3) for Policy Learning with Discrete Action Space
by: Li, Bangzheng, et al.
Published: (2024)
Chain-of-Goals Hierarchical Policy for Long-Horizon Offline Goal-Conditioned RL
by: Choi, Jinwoo, et al.
Published: (2026)
Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption
by: He, Longxiang, et al.
Published: (2025)
Budgeting Counterfactual for Offline RL
by: Liu, Yao, et al.
Published: (2023)
Accelerating Energy-Efficient Federated Learning in Cell-Free Networks with Adaptive Quantization
by: Mahmoudi, Afsaneh, et al.
Published: (2024)
SPAARS: Safer RL Policy Alignment through Abstract Exploration and Refined Exploitation of Action Space
by: K, Swaminathan S, et al.
Published: (2026)
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
by: Zhang, Tianle, et al.
Published: (2024)
Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients
by: Alvo, Matias, et al.
Published: (2026)
Accelerating Diffusion Planners in Offline RL via Reward-Aware Consistency Trajectory Distillation
by: Duan, Xintong, et al.
Published: (2025)
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
by: Zhan, Wenhao, et al.
Published: (2024)
Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only
by: Xiao, Wei, et al.
Published: (2025)