Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2021
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2109.14412 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866929322102620160 |
|---|---|
| author | Grant, James A. Leslie, David S. |
| author_facet | Grant, James A. Leslie, David S. |
| contents | We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class. If, but only if, the learner chooses label $1$ they immediately observe the true label of the item. The learner faces a trade-off between short-term classification accuracy and long-term information gain. This problem has previously been studied under the name of the `apple tasting' problem. We revisit this problem as a partial monitoring problem with side information, and focus on the case where item features are linked to true classes via a logistic regression model. Our principal contribution is a study of the performance of Thompson Sampling (TS) for this problem. Using recently developed information-theoretic tools, we show that TS achieves a Bayesian regret bound of an improved order to previous approaches. Further, we experimentally verify that efficient approximations to TS and Information Directed Sampling via Pólya-Gamma augmentation have superior empirical performance to existing methods. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2109_14412 |
| institution | arXiv |
| publishDate | 2021 |
| record_format | arxiv |
| spellingShingle | Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification Grant, James A. Leslie, David S. Machine Learning We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class. If, but only if, the learner chooses label $1$ they immediately observe the true label of the item. The learner faces a trade-off between short-term classification accuracy and long-term information gain. This problem has previously been studied under the name of the `apple tasting' problem. We revisit this problem as a partial monitoring problem with side information, and focus on the case where item features are linked to true classes via a logistic regression model. Our principal contribution is a study of the performance of Thompson Sampling (TS) for this problem. Using recently developed information-theoretic tools, we show that TS achieves a Bayesian regret bound of an improved order to previous approaches. Further, we experimentally verify that efficient approximations to TS and Information Directed Sampling via Pólya-Gamma augmentation have superior empirical performance to existing methods. |
| title | Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2109.14412 |