:: Library Catalog

Bibliographic Details
Main Authors:	Li, Haoran; Zhang, Zicheng; Luo, Wang; Han, Congying; Hu, Yudong; Guo, Tiande; Liao, Shichen
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.02165

Similar Items

Towards Optimal Adversarial Robust Reinforcement Learning with Infinity Measurement Error
by: Li, Haoran, et al.
Published: (2025)

On the Tension Between Optimality and Adversarial Robustness in Policy Optimization
by: Li, Haoran, et al.
Published: (2025)

Dual Alignment Maximin Optimization for Offline Model-based RL
by: Zhou, Chi, et al.
Published: (2025)

Mitigating Distribution Shift in Model-based Offline RL via Shifts-aware Reward Learning
by: Luo, Wang, et al.
Published: (2024)

Purity Law for Generalizable Neural TSP Solvers
by: Liu, Wenzhao, et al.
Published: (2025)

Robust Accelerated Adaptive Search: High-Probability Complexity Bounds under Bounded-Moment Stochastic Oracles
by: Zhang, Shunzhi, et al.
Published: (2026)

A Fast Anti-Jamming Cognitive Radar Deployment Algorithm Based on Reinforcement Learning
by: Cai, Wencheng, et al.
Published: (2025)

Understanding Oversmoothing in Diffusion-Based GNNs From the Perspective of Operator Semigroup Theory
by: Zhao, Weichen, et al.
Published: (2024)

Applying Opponent Modeling for Automatic Bidding in Online Repeated Auctions
by: Hu, Yudong, et al.
Published: (2022)

A-PSRO: A Unified Strategy Learning Method with Advantage Function for Normal-form Games
by: Hu, Yudong, et al.
Published: (2023)

On the optimal pivot path of simplex method for linear programming based on reinforcement learning
by: Li, Anqi, et al.
Published: (2022)

Preference-based opponent shaping in differentiable games
by: Qiao, Xinyu, et al.
Published: (2024)

DR-BFR: Degradation Representation with Diffusion Models for Blind Face Restoration
by: Qiu, Xinmin, et al.
Published: (2024)

Resolving Endpoint Underfitting in Diffusion Bridges via Noise Alignment
by: Gao, Yurong, et al.
Published: (2026)

A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond
by: Hu, Zicheng, et al.
Published: (2025)

Understanding the theoretical properties of projected Bellman equation, linear Q-learning, and approximate value iteration
by: Lim, Han-Dong, et al.
Published: (2025)

Towards Blackwell Optimality: Bellman Optimality Is All You Can Get
by: Boone, Victor, et al.
Published: (2025)

Relating Checkpoint Update Probabilities to Momentum Parameters in Single-Loop Variance Reduction Methods
by: Liu, Hai, et al.
Published: (2026)

ShiQ: Bringing back Bellman to LLMs
by: Clavier, Pierre, et al.
Published: (2025)

Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
by: Omura, Motoki, et al.
Published: (2024)

StyO: Stylize Your Face in Only One-shot
by: Li, Bonan, et al.
Published: (2023)

Bellman Optimality of Average-Reward Robust Markov Decision Processes with a Constant Gain
by: Wang, Shengbo, et al.
Published: (2025)

BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
by: Qiu, Xinmin, et al.
Published: (2024)

Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints
by: Xu, Tian, et al.
Published: (2026)

Hierarchical Refinement: Optimal Transport to Infinity and Beyond
by: Halmos, Peter, et al.
Published: (2025)

Fitted $Q$ Evaluation Without Bellman Completeness via Stationary Weighting
by: van der Laan, Lars, et al.
Published: (2025)

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning
by: Omura, Motoki, et al.
Published: (2025)

Certifiably Safe Manipulation of Deformable Linear Objects via Joint Shape and Tension Prediction
by: Zhang, Yiting, et al.
Published: (2025)

Robust Decentralized Multi-armed Bandits: From Corruption-Resilience to Byzantine-Resilience
by: Hu, Zicheng, et al.
Published: (2025)

Intriguing Frequency Interpretation of Adversarial Robustness for CNNs and ViTs
by: Chen, Lu, et al.
Published: (2025)

A Mutil-conditional Diffusion Transformer for Versatile Seismic Wave Generation
by: Duan, Longfei, et al.
Published: (2025)

Regularized Q-learning
by: Lim, Han-Dong, et al.
Published: (2022)

Multi-Modal Data Fusion for Moisture Content Prediction in Apple Drying
by: Li, Shichen, et al.
Published: (2025)

Bellman operator convergence enhancements in reinforcement learning algorithms
by: Kadurha, David Krame, et al.
Published: (2025)

Towards Fair Class-wise Robustness: Class Optimal Distribution Adversarial Training
by: Zhi, Hongxin, et al.
Published: (2025)

Coupled VAE: Improved Accuracy and Robustness of a Variational Autoencoder
by: Cao, Shichen, et al.
Published: (2019)

Bellman Optimal Stepsize Straightening of Flow-Matching Models
by: Nguyen, Bao, et al.
Published: (2023)

Bellman Error Centering
by: Chen, Xingguo, et al.
Published: (2025)

Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy
by: Zheng, Xiang, et al.
Published: (2023)

XPG-RL: Reinforcement Learning with Explainable Priority Guidance for Efficiency-Boosted Mechanical Search
by: Zhang, Yiting, et al.
Published: (2025)