| Main Authors: | Patel, Shrenik; Truong, Christine |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.02970 |
Similar Items
DoubleTake: Contrastive Reasoning for Faithful Decision-Making in Medical Imaging
by: Patel, Daivik, et al.
Published: (2026)
FairMarket-RL: LLM-Guided Fairness Shaping for Multi-Agent Reinforcement Learning in Peer-to-Peer Markets
by: Jadhav, Shrenik, et al.
Published: (2025)
Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
by: Bhansali, Shrenik, et al.
Published: (2025)
M3PO: Massively Multi-Task Model-Based Policy Optimization
by: Narendra, Aditya, et al.
Published: (2025)
Bayesian Calibration of Engine-out NOx Models for Engine-to-Engine Transferability
by: Zinage, Shrenik, et al.
Published: (2025)
DKL-KAN: Scalable Deep Kernel Learning using Kolmogorov-Arnold Networks
by: Zinage, Shrenik, et al.
Published: (2024)
A Causal Graph-Enhanced Gaussian Process Regression for Modeling Engine-out NOx
by: Zinage, Shrenik, et al.
Published: (2024)
Scalable Fairness Shaping with LLM-Guided Multi-Agent Reinforcement Learning for Peer-to-Peer Electricity Markets
by: Jadhav, Shrenik, et al.
Published: (2025)
Co-Optimizing Reconfigurable Environments and Policies for Decentralized Multi-Agent Navigation
by: Gao, Zhan, et al.
Published: (2024)
HUANet: Hard-Constrained Unrolled ADMM for Constrained Convex Optimization
by: Tran, Trinh, et al.
Published: (2026)
Multi-Agent Reinforcement Learning for Unmanned Aerial Vehicle Coordination by Multi-Critic Policy Gradient Optimization
by: Alon, Yoav, et al.
Published: (2020)
Deep RL With Information Constrained Policies: Generalization in Continuous Control
by: Malloy, Tailia, et al.
Published: (2020)
Fat-to-Thin Policy Optimization: Offline RL with Sparse Policies
by: Zhu, Lingwei, et al.
Published: (2025)
Conformal Constrained Policy Optimization for Cost-Effective LLM Agents
by: Si, Wenwen, et al.
Published: (2025)
RePO: Bridging On-Policy Learning and Off-Policy Knowledge through Rephrasing Policy Optimization
by: Xia, Linxuan, et al.
Published: (2026)
OPTIMA: Optimized Policy for Intelligent Multi-Agent Systems Enables Coordination-Aware Autonomous Vehicles
by: Du, Rui, et al.
Published: (2024)
RePO: Replay-Enhanced Policy Optimization
by: Li, Siheng, et al.
Published: (2025)
CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning
by: Hedman, Marcel, et al.
Published: (2026)
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
by: Liu, Shih-Yang, et al.
Published: (2026)
Capacity-Aware Planning and Scheduling in Budget-Constrained Multi-Agent MDPs: A Meta-RL Approach
by: Vora, Manav, et al.
Published: (2024)
State-wise Constrained Policy Optimization
by: Zhao, Weiye, et al.
Published: (2023)
Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning
by: Zhu, Taojie, et al.
Published: (2026)
Soft Policy Optimization: Online Off-Policy RL for Sequence Models
by: Cohen, Taco, et al.
Published: (2025)
e-COP : Episodic Constrained Optimization of Policies
by: Agnihotri, Akhil, et al.
Published: (2024)
Proactive Constrained Policy Optimization with Preemptive Penalty
by: Yang, Ning, et al.
Published: (2025)
ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents
by: Patel, Daivik, et al.
Published: (2025)
Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training
by: Fakoor, Rasool, et al.
Published: (2026)
Inference Time Policy Optimization for Offline RL with Differentiable World Models
by: Deb, Rohan, et al.
Published: (2026)
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
by: Patel, Nishil, et al.
Published: (2023)
LEGO: Language Model Building Blocks
by: Bhansali, Shrenik, et al.
Published: (2024)
Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies
by: Corrado, Nicholas E., et al.
Published: (2025)
When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs
by: Zeng, Yifan, et al.
Published: (2026)
Ready from Day 1: Population-Aware Coordination for Large-Scale Constrained Multi-Agent Systems
by: Wang, Angel, et al.
Published: (2026)
Constrained Policy Optimization with Cantelli-Bounded Value-at-Risk
by: Tangri, Rohan, et al.
Published: (2026)
Constrained Group Relative Policy Optimization
by: Girgis, Roger, et al.
Published: (2026)
Robustifying a Policy in Multi-Agent RL with Diverse Cooperative Behaviors and Adversarial Style Sampling for Assistive Tasks
by: Osa, Takayuki, et al.
Published: (2024)
Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives
by: Zhang, Qixin, et al.
Published: (2025)
When are LLMs Sufficient Policy Optimizers for Sequential RL Tasks?
by: Hatgis-Kessell, Stephane, et al.
Published: (2026)
Network-Constrained Policy Optimization for Adaptive Multi-agent Vehicle Routing
by: Arasteh, Fazel, et al.
Published: (2025)
COMPASS: Benchmarking Constrained Optimization in LLM Agents
by: Qin, Tian, et al.
Published: (2025)