Library Catalog

Bibliographic Details
Main Authors:	Bai, Wensong; Zhang, Chao; Fu, Yichao; Zhao, Peilin; Qian, Hui; Dai, Bin
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2306.06637

Similar Items

Conditional Sequence Modeling for Safe Reinforcement Learning
by: Bai, Wensong, et al.
Published: (2026)

Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
by: Zhang, Qingyang, et al.
Published: (2025)

Distributionally Robust Multimodal Machine Learning
by: Yang, Peilin, et al.
Published: (2025)

When Drafts Evolve: Speculative Decoding Meets Online Learning
by: Qian, Yu-Yang, et al.
Published: (2026)

Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem
by: Li, Chao, et al.
Published: (2025)

PACER: Acyclic Causal Discovery from Large-Scale Interventional Data
by: Torné, Ramon Viñas, et al.
Published: (2026)

Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization
by: Hao, Ruijie, et al.
Published: (2026)

An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks
by: Papadopoulos, George, et al.
Published: (2025)

Diffusion Models for Reinforcement Learning: A Survey
by: Zhu, Zhengbang, et al.
Published: (2023)

1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
by: Gao, Chang, et al.
Published: (2024)

Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
by: Dai, Juntao, et al.
Published: (2024)

Distributionally Robust Federated Learning: An ADMM Algorithm
by: Bai, Wen, et al.
Published: (2025)

Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning
by: Ma, Huan, et al.
Published: (2023)

Analyzing and Bridging the Gap between Maximizing Total Reward and Discounted Reward in Deep Reinforcement Learning
by: Yin, Shuyu, et al.
Published: (2024)

Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning
by: Hao, Chenjie, et al.
Published: (2025)

Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
by: Zhao, Lei, et al.
Published: (2023)

Distributional Inverse Reinforcement Learning
by: Wu, Feiyang, et al.
Published: (2025)

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
by: Wu, Peilin, et al.
Published: (2026)

Quantile Geometry Regularization for Distributional Reinforcement Learning
by: Zhang, Zhaofan, et al.
Published: (2026)

PPFL-RDSN: Privacy-Preserving Federated Learning-based Residual Dense Spatial Networks for Encrypted Lossy Image Reconstruction
by: He, Peilin, et al.
Published: (2025)

Self-Reinforced Graph Contrastive Learning
by: Hsieh, Chou-Ying, et al.
Published: (2025)

Deep Think with Confidence
by: Fu, Yichao, et al.
Published: (2025)

Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning
by: Yang, Haoyu, et al.
Published: (2026)

Learning to Coordinate: Distributed Meta-Trajectory Optimization Via Differentiable ADMM-DDP
by: Wang, Bingheng, et al.
Published: (2025)

Enhanced Penalty-based Bidirectional Reinforcement Learning Algorithms
by: Pula, Sai Gana Sandeep, et al.
Published: (2025)

Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms
by: Zhou, Zehao
Published: (2024)

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm
by: Lu, Miao, et al.
Published: (2024)

Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning
by: Bai, Hui, et al.
Published: (2024)

SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
by: Bai, Runsheng, et al.
Published: (2024)

A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
by: Tomic, Slavisa, et al.
Published: (2024)

Interpret Policies in Deep Reinforcement Learning using SILVER with RL-Guided Labeling: A Model-level Approach to High-dimensional and Multi-action Environments
by: Qian, Yiyu, et al.
Published: (2025)

DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning
by: Wang, Yujie, et al.
Published: (2026)

EnhancedRL: An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Recommender Systems
by: Liu, Peng, et al.
Published: (2024)

Safeguarding LLM Fine-tuning via Push-Pull Distributional Alignment
by: Wang, Haozhong, et al.
Published: (2026)

Distributional Reinforcement Learning with Diffusion Bridge Critics
by: Ding, Shutong, et al.
Published: (2026)

Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
by: Ebrahim, Maad, et al.
Published: (2024)

Theoretically Guaranteed Distribution Adaptable Learning
by: Xu, Chao, et al.
Published: (2024)

Efficient LLM Scheduling by Learning to Rank
by: Fu, Yichao, et al.
Published: (2024)

Trustworthy Efficient Communication for Distributed Learning using LQ-SGD Algorithm
by: Li, Hongyang, et al.
Published: (2025)

Estimation and Inference in Distributional Reinforcement Learning
by: Zhang, Liangyu, et al.
Published: (2023)