| Main Authors: | Bai, Wensong, Zhang, Chao, Fu, Yichao, Zhao, Peilin, Qian, Hui, Dai, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2306.06637 |
Similar Items
Conditional Sequence Modeling for Safe Reinforcement Learning
by: Bai, Wensong, et al.
Published: (2026)
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
by: Zhang, Qingyang, et al.
Published: (2025)
Distributionally Robust Multimodal Machine Learning
by: Yang, Peilin, et al.
Published: (2025)
When Drafts Evolve: Speculative Decoding Meets Online Learning
by: Qian, Yu-Yang, et al.
Published: (2026)
Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem
by: Li, Chao, et al.
Published: (2025)
PACER: Acyclic Causal Discovery from Large-Scale Interventional Data
by: Torné, Ramon Viñas, et al.
Published: (2026)
Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization
by: Hao, Ruijie, et al.
Published: (2026)
An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks
by: Papadopoulos, George, et al.
Published: (2025)
Diffusion Models for Reinforcement Learning: A Survey
by: Zhu, Zhengbang, et al.
Published: (2023)
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
by: Gao, Chang, et al.
Published: (2024)
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
by: Dai, Juntao, et al.
Published: (2024)
Distributionally Robust Federated Learning: An ADMM Algorithm
by: Bai, Wen, et al.
Published: (2025)
Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning
by: Ma, Huan, et al.
Published: (2023)
Analyzing and Bridging the Gap between Maximizing Total Reward and Discounted Reward in Deep Reinforcement Learning
by: Yin, Shuyu, et al.
Published: (2024)
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning
by: Hao, Chenjie, et al.
Published: (2025)
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
by: Zhao, Lei, et al.
Published: (2023)
Distributional Inverse Reinforcement Learning
by: Wu, Feiyang, et al.
Published: (2025)
AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
by: Wu, Peilin, et al.
Published: (2026)
Quantile Geometry Regularization for Distributional Reinforcement Learning
by: Zhang, Zhaofan, et al.
Published: (2026)
PPFL-RDSN: Privacy-Preserving Federated Learning-based Residual Dense Spatial Networks for Encrypted Lossy Image Reconstruction
by: He, Peilin, et al.
Published: (2025)
Self-Reinforced Graph Contrastive Learning
by: Hsieh, Chou-Ying, et al.
Published: (2025)
Deep Think with Confidence
by: Fu, Yichao, et al.
Published: (2025)
Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning
by: Yang, Haoyu, et al.
Published: (2026)
Learning to Coordinate: Distributed Meta-Trajectory Optimization Via Differentiable ADMM-DDP
by: Wang, Bingheng, et al.
Published: (2025)
Enhanced Penalty-based Bidirectional Reinforcement Learning Algorithms
by: Pula, Sai Gana Sandeep, et al.
Published: (2025)
Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
by: Zhou, Zehao
Published: (2024)
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
by: Lu, Miao, et al.
Published: (2024)
Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning
by: Bai, Hui, et al.
Published: (2024)
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
by: Bai, Runsheng, et al.
Published: (2024)
A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
by: Tomic, Slavisa, et al.
Published: (2024)
Interpret Policies in Deep Reinforcement Learning using SILVER with RL-Guided Labeling: A Model-level Approach to High-dimensional and Multi-action Environments
by: Qian, Yiyu, et al.
Published: (2025)
DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning
by: Wang, Yujie, et al.
Published: (2026)
EnhancedRL: An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Recommender Systems
by: Liu, Peng, et al.
Published: (2024)
Safeguarding LLM Fine-tuning via Push-Pull Distributional Alignment
by: Wang, Haozhong, et al.
Published: (2026)
Distributional Reinforcement Learning with Diffusion Bridge Critics
by: Ding, Shutong, et al.
Published: (2026)
Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
by: Ebrahim, Maad, et al.
Published: (2024)
Theoretically Guaranteed Distribution Adaptable Learning
by: Xu, Chao, et al.
Published: (2024)
Efficient LLM Scheduling by Learning to Rank
by: Fu, Yichao, et al.
Published: (2024)
Trustworthy Efficient Communication for Distributed Learning using LQ-SGD Algorithm
by: Li, Hongyang, et al.
Published: (2025)
Estimation and Inference in Distributional Reinforcement Learning
by: Zhang, Liangyu, et al.
Published: (2023)