| Main Authors: | Woo, Jiin, Garakani, Alireza Bagheri, Zhou, Tianchen, Huang, Zhishen, Gao, Yan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.21274 |
Similar Items
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
by: Woo, Jiin, et al.
Published: (2024)
Sample Complexity of Average-Reward Q-Learning: From Single-agent to Federated Reinforcement Learning
by: Jiao, Yuchen, et al.
Published: (2026)
Owen-Shapley Policy Optimization: A Principled RL Algorithm for Generative Search LLMs
by: Nath, Abhijnan, et al.
Published: (2026)
ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning
by: Sun, Zhishen, et al.
Published: (2026)
Federate the Router: Learning Language Model Routers with Sparse and Decentralized Evaluations
by: Askin, Baris, et al.
Published: (2026)
Enhancing Recommendation Diversity by Re-ranking with Large Language Models
by: Carraro, Diego, et al.
Published: (2024)
ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models
by: Choi, Yujeong, et al.
Published: (2024)
Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation
by: Kim, Woo Kyung, et al.
Published: (2024)
Stochastic Gradient Langevin Dynamics with Variance Reduction
by: Huang, Zhishen, et al.
Published: (2021)
Finite-Time Convergence and Sample Complexity of Actor-Critic Multi-Objective Reinforcement Learning
by: Zhou, Tianchen, et al.
Published: (2024)
Efficient Reinforcement Learning with Large Language Model Priors
by: Yan, Xue, et al.
Published: (2024)
Enabling Pareto-Stationarity Exploration in Multi-Objective Reinforcement Learning: A Multi-Objective Weighted-Chebyshev Actor-Critic Approach
by: Hairi, Fnu, et al.
Published: (2025)
Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
by: Anschel, Oron, et al.
Published: (2025)
Neighborhood-Order Learning Graph Attention Network for Fake News Detection
by: Lakzaei, Batool, et al.
Published: (2025)
Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning
by: Sharma, Amit, et al.
Published: (2024)
Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models
by: Zhang, Tianchen, et al.
Published: (2024)
TritonRL: Training LLMs to Think and Code Triton Without Cheating
by: Woo, Jiin, et al.
Published: (2025)
The Role of Diversity in In-Context Learning for Large Language Models
by: Xiao, Wenyang, et al.
Published: (2025)
Brain Effective Connectome based on fMRI and DTI Data: Bayesian Causal Learning and Assessment
by: Bagheri, Abdolmahdi, et al.
Published: (2023)
Block-R1: Rethinking the Role of Block Size in Multi-domain Reinforcement Learning for Diffusion Large Language Models
by: Jiang, Yan, et al.
Published: (2026)
A Novel Data-Dependent Learning Paradigm for Large Hypothesis Classes
by: Pour, Alireza F., et al.
Published: (2025)
FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data
by: Im, Jiin, et al.
Published: (2024)
Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning
by: Jiang, Yan, et al.
Published: (2026)
AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models
by: Zhang, Jiarui, et al.
Published: (2026)
Guiding Generative Models to Uncover Diverse and Novel Crystals via Reinforcement Learning
by: Park, Hyunsoo, et al.
Published: (2025)
Reinforcement Learning from Diverse Human Preferences
by: Xue, Wanqi, et al.
Published: (2023)
Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods
by: Cao, Yuji, et al.
Published: (2024)
Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
by: Sun, Yan, et al.
Published: (2025)
Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation
by: Ji, Luo, et al.
Published: (2024)
Goal-Guided Efficient Exploration via Large Language Model in Reinforcement Learning
by: Qi, Yajie, et al.
Published: (2025)
A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback
by: Laleh, Alireza Rashidi, et al.
Published: (2024)
LLM-based Personalized Portfolio Recommender: Integrating Large Language Models and Reinforcement Learning for Intelligent Investment Strategy Optimization
by: Li, Bangyu, et al.
Published: (2025)
On Predictability of Reinforcement Learning Dynamics for Large Language Models
by: Cai, Yuchen, et al.
Published: (2025)
Routoo: Learning to Route to Large Language Models Effectively
by: Mohammadshahi, Alireza, et al.
Published: (2024)
Large Language Models for Intent-Driven Session Recommendations
by: Sun, Zhu, et al.
Published: (2023)
Teaching Large Language Models to Reason with Reinforcement Learning
by: Havrilla, Alex, et al.
Published: (2024)
DISPO: Enhancing Training Efficiency and Stability in Reinforcement Learning for Large Language Model Mathematical Reasoning
by: Karaman, Batuhan K., et al.
Published: (2026)
RLAX: Large-Scale, Distributed Reinforcement Learning for Large Language Models on TPUs
by: Zhou, Runlong, et al.
Published: (2025)
Enhancing Diffusion Model Guidance through Calibration and Regularization
by: Javid, Seyed Alireza, et al.
Published: (2025)
Large Language Model-Enhanced Reinforcement Learning for Generic Bus Holding Control Strategies
by: Yu, Jiajie, et al.
Published: (2024)