Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2023
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2306.09202 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866910777254871040 |
|---|---|
| author | Nakamura, Shintaro Sugiyama, Masashi |
| author_facet | Nakamura, Shintaro Sugiyama, Masashi |
| contents | We study the real-valued combinatorial pure exploration problem in the stochastic multi-armed bandit (R-CPE-MAB). We study the case where the size of the action set is polynomial with respect to the number of arms. In such a case, the R-CPE-MAB can be seen as a special case of the so-called transductive linear bandits. We introduce an algorithm named the combinatorial gap-based exploration (CombGapE) algorithm, whose sample complexity upper bound matches the lower bound up to a problem-dependent constant factor. We numerically show that the CombGapE algorithm outperforms existing methods significantly in both synthetic and real-world datasets. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2306_09202 |
| institution | arXiv |
| publishDate | 2023 |
| record_format | arxiv |
| spellingShingle | A Fast Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit Nakamura, Shintaro Sugiyama, Masashi Machine Learning We study the real-valued combinatorial pure exploration problem in the stochastic multi-armed bandit (R-CPE-MAB). We study the case where the size of the action set is polynomial with respect to the number of arms. In such a case, the R-CPE-MAB can be seen as a special case of the so-called transductive linear bandits. We introduce an algorithm named the combinatorial gap-based exploration (CombGapE) algorithm, whose sample complexity upper bound matches the lower bound up to a problem-dependent constant factor. We numerically show that the CombGapE algorithm outperforms existing methods significantly in both synthetic and real-world datasets. |
| title | A Fast Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2306.09202 |