Staff View: A Fast Convergence Theory for Offline Decision Making :: Library Catalog

Bibliographic Details
Main Authors:	Mao, Chenjie; Zhang, Qiaosheng
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2406.01378

_version_	1866909413821906944
author	Mao, Chenjie; Zhang, Qiaosheng
author_facet	Mao, Chenjie; Zhang, Qiaosheng
contents	This paper proposes the first generic fast convergence result in general function approximation for offline decision making problems, which include offline reinforcement learning (RL) and off-policy evaluation (OPE) as special cases. To unify different settings, we introduce a framework called Decision Making with Offline Feedback (DMOF), which captures a wide range of offline decision making problems. Within this framework, we propose a simple yet powerful algorithm called Empirical Decision with Divergence (EDD), whose upper bound can be expressed in terms of a coefficient named the Empirical Offline Estimation Coefficient (EOEC). We show that EOEC is instance-dependent and actually measures the correlation of the problem. Assuming partial coverage in the dataset, EOEC decreases at a rate of $1/N$, where $N$ is the size of the dataset, endowing EDD with a fast convergence guarantee. Finally, we complement the above results with a lower bound in the DMOF framework, which further demonstrates the soundness of our theory.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_01378
institution	arXiv
publishDate	2024
record_format	arxiv
title	A Fast Convergence Theory for Offline Decision Making
topic	Machine Learning
url	https://arxiv.org/abs/2406.01378
