Saved in:
Bibliographic Details
Main Authors: Grant, James A., Leslie, David S.
Format: Preprint
Published: 2021
Subjects:
Online Access:https://arxiv.org/abs/2109.14412
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866929322102620160
author Grant, James A.
Leslie, David S.
author_facet Grant, James A.
Leslie, David S.
contents We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class. If, but only if, the learner chooses label $1$ they immediately observe the true label of the item. The learner faces a trade-off between short-term classification accuracy and long-term information gain. This problem has previously been studied under the name of the `apple tasting' problem. We revisit this problem as a partial monitoring problem with side information, and focus on the case where item features are linked to true classes via a logistic regression model. Our principal contribution is a study of the performance of Thompson Sampling (TS) for this problem. Using recently developed information-theoretic tools, we show that TS achieves a Bayesian regret bound of an improved order to previous approaches. Further, we experimentally verify that efficient approximations to TS and Information Directed Sampling via Pólya-Gamma augmentation have superior empirical performance to existing methods.
format Preprint
id arxiv_https___arxiv_org_abs_2109_14412
institution arXiv
publishDate 2021
record_format arxiv
spellingShingle Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification
Grant, James A.
Leslie, David S.
Machine Learning
We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class. If, but only if, the learner chooses label $1$ they immediately observe the true label of the item. The learner faces a trade-off between short-term classification accuracy and long-term information gain. This problem has previously been studied under the name of the `apple tasting' problem. We revisit this problem as a partial monitoring problem with side information, and focus on the case where item features are linked to true classes via a logistic regression model. Our principal contribution is a study of the performance of Thompson Sampling (TS) for this problem. Using recently developed information-theoretic tools, we show that TS achieves a Bayesian regret bound of an improved order to previous approaches. Further, we experimentally verify that efficient approximations to TS and Information Directed Sampling via Pólya-Gamma augmentation have superior empirical performance to existing methods.
title Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification
topic Machine Learning
url https://arxiv.org/abs/2109.14412