Saved in:
Bibliographic Details
Main Authors: Maynard-Zhang, Leo, Xiong, Zhihan, Jamieson, Kevin, Fazel, Maryam
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2603.10346
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912960176193536
author Maynard-Zhang, Leo
Xiong, Zhihan
Jamieson, Kevin
Fazel, Maryam
author_facet Maynard-Zhang, Leo
Xiong, Zhihan
Jamieson, Kevin
Fazel, Maryam
contents We study the fixed-budget best-arm identification (BAI) problem in non-stationary linear bandits. Concretely, given a fixed time budget $T\in \mathbb{N}$, finite arm set $\mathcal{X} \subset \mathbb{R}^d$, and a potentially adversarial sequence of unknown parameters $\lbrace θ_t\rbrace_{t=1}^{T}$ (hence non-stationary), a learner aims to identify the arm with the largest cumulative reward $x_* = \arg\max_{x \in \mathcal{X}} x^\top\sum_{t=1}^T θ_t$ with high probability. In this setting, it is well-known that uniformly sampling arms from the G-optimal design yields a minimax-optimal error probability of $\exp\left(-Θ\left(T / H_{G}\right)\right)$, where $H_{G}$ scales proportionally with the dimension $d$. However, this notion of complexity is overly pessimistic, as it is derived from a lower bound in which the arm set consists only of the standard basis vectors, thus masking any potential advantages arising from arm sets with richer geometric structure. To address this, we establish an arm-set-dependent lower bound that, in contrast, holds for any arm set. Motivated by the ideas underlying our lower bound, we propose the Adjacent-optimal design, a specialization of the well-known $\mathcal{X}\mathcal{Y}$-optimal design, and develop the $\textsf{Adjacent-BAI}$ algorithm. We prove that the error probability of $\textsf{Adjacent-BAI}$ matches our lower bound up to constants, verifying the tightness of our lower bound, and establishing the arm-set-dependent complexity of this setting.
format Preprint
id arxiv_https___arxiv_org_abs_2603_10346
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle On The Complexity of Best-Arm Identification in Non-Stationary Linear Bandits
Maynard-Zhang, Leo
Xiong, Zhihan
Jamieson, Kevin
Fazel, Maryam
Machine Learning
We study the fixed-budget best-arm identification (BAI) problem in non-stationary linear bandits. Concretely, given a fixed time budget $T\in \mathbb{N}$, finite arm set $\mathcal{X} \subset \mathbb{R}^d$, and a potentially adversarial sequence of unknown parameters $\lbrace θ_t\rbrace_{t=1}^{T}$ (hence non-stationary), a learner aims to identify the arm with the largest cumulative reward $x_* = \arg\max_{x \in \mathcal{X}} x^\top\sum_{t=1}^T θ_t$ with high probability. In this setting, it is well-known that uniformly sampling arms from the G-optimal design yields a minimax-optimal error probability of $\exp\left(-Θ\left(T / H_{G}\right)\right)$, where $H_{G}$ scales proportionally with the dimension $d$. However, this notion of complexity is overly pessimistic, as it is derived from a lower bound in which the arm set consists only of the standard basis vectors, thus masking any potential advantages arising from arm sets with richer geometric structure. To address this, we establish an arm-set-dependent lower bound that, in contrast, holds for any arm set. Motivated by the ideas underlying our lower bound, we propose the Adjacent-optimal design, a specialization of the well-known $\mathcal{X}\mathcal{Y}$-optimal design, and develop the $\textsf{Adjacent-BAI}$ algorithm. We prove that the error probability of $\textsf{Adjacent-BAI}$ matches our lower bound up to constants, verifying the tightness of our lower bound, and establishing the arm-set-dependent complexity of this setting.
title On The Complexity of Best-Arm Identification in Non-Stationary Linear Bandits
topic Machine Learning
url https://arxiv.org/abs/2603.10346