| Main Authors: | Corrado, Nicholas E., Hanna, Josiah P. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.01049 |
Similar Items
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
by: Corrado, Nicholas E., et al.
Published: (2023)
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
by: Corrado, Nicholas E., et al.
Published: (2026)
Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates
by: Corrado, Nicholas E., et al.
Published: (2023)
Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning
by: Corrado, Nicholas E., et al.
Published: (2023)
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
by: Zhou, Hongyi, et al.
Published: (2025)
SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP
by: Mukherjee, Subhojyoti, et al.
Published: (2024)
Adaptive Exploration for Data-Efficient General Value Function Evaluations
by: Jain, Arushi, et al.
Published: (2024)
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits
by: Mukherjee, Subhojyoti, et al.
Published: (2023)
When Can Model-Free Reinforcement Learning be Enough for Thinking?
by: Hanna, Josiah P., et al.
Published: (2025)
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
by: Mukherjee, Subhojyoti, et al.
Published: (2024)
SMAT: Staged Multi-Agent Training for Co-Adaptive Exoskeleton Control
by: Yuan, Yifei, et al.
Published: (2026)
Policy and World Modeling Co-Training for Language Agents
by: Lu, Ning, et al.
Published: (2026)
Stable Offline Value Function Learning with Bisimulation-based Representations
by: Pavse, Brahma S., et al.
Published: (2024)
Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training
by: Fakoor, Rasool, et al.
Published: (2026)
Agent-Agnostic Centralized Training for Decentralized Multi-Agent Cooperative Driving
by: Yan, Shengchao, et al.
Published: (2024)
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
by: Pavse, Brahma S., et al.
Published: (2023)
Adaptive Sample Sharing for Multi Agent Linear Bandits
by: Cherkaoui, Hamza, et al.
Published: (2023)
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs
by: Corrado, Nicholas E., et al.
Published: (2025)
An Empirical Study on the Power of Future Prediction in Partially Observable Environments
by: Kwon, Jeongyeol, et al.
Published: (2024)
Adaptive Federated LoRA in Heterogeneous Wireless Networks with Independent Sampling
by: Hou, Yanzhao, et al.
Published: (2025)
Co2PO: Coordinated Constrained Policy Optimization for Multi-Agent RL
by: Patel, Shrenik, et al.
Published: (2026)
MAT-Agent: Adaptive Multi-Agent Training Optimization
by: Zhang, Jusheng, et al.
Published: (2025)
Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning
by: Xu, Zhuofan, et al.
Published: (2025)
Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems
by: Karnam, Meghana, et al.
Published: (2026)
Reinforcement Learning via Auxiliary Task Distillation
by: Harish, Abhinav Narayan, et al.
Published: (2024)
Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling
by: Geng, Jiaxiang, et al.
Published: (2024)
An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning
by: Amato, Christopher
Published: (2024)
Action-Graph Policies: Learning Action Co-dependencies in Multi-Agent Reinforcement Learning
by: Gupta, Nikunj, et al.
Published: (2026)
Adaptive Gradient Normalization and Independent Sampling for (Stochastic) Generalized-Smooth Optimization
by: Yang, Yufeng, et al.
Published: (2024)
Sparsely Multimodal Data Fusion
by: Bjorgaard, Josiah
Published: (2024)
Co-Optimizing Reconfigurable Environments and Policies for Decentralized Multi-Agent Navigation
by: Gao, Zhan, et al.
Published: (2024)
Offline Multi-Agent Reinforcement Learning via In-Sample Sequential Policy Optimization
by: Liu, Zongkai, et al.
Published: (2024)
Rollout-Training Co-Design for Efficient LLM-Based Multi-Agent Reinforcement Learning
by: Jiang, Zhida, et al.
Published: (2026)
CoFi-PGMA: Counterfactual Policy Gradients under Filtered Feedback for Multi-Agent LLMs
by: Tong, Stela, et al.
Published: (2026)
Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models
by: Zhang, Yang, et al.
Published: (2024)
Reliability-Aware Adaptive Self-Consistency for Efficient Sampling in LLM Reasoning
by: Kim, Junseok, et al.
Published: (2026)
Fully Independent Communication in Multi-Agent Reinforcement Learning
by: Pina, Rafael, et al.
Published: (2024)
Emergent Coordination and Phase Structure in Independent Multi-Agent Reinforcement Learning
by: Yamaguchi, Azusa
Published: (2025)
Balanced Training of Energy-Based Models with Adaptive Flow Sampling
by: Grenioux, Louis, et al.
Published: (2023)
Approximate Global Convergence of Independent Learning in Multi-Agent Systems
by: Jin, Ruiyang, et al.
Published: (2024)