Inhaltsangabe: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Zeng, Qirun, Wang, Xuchuang, Shen, Jiayi, Liu, Xutong, Kong, Fang, Zuo, Jinhang
Format:	Preprint
Veröffentlicht:	2026
Schlagworte:	Artificial Intelligence
Online-Zugang:	https://arxiv.org/abs/2605.05745
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Inhaltsangabe:

We study fixed-confidence best arm identification in generalized linear bandits under a hybrid feedback model: at each round, the learner may query either (i) absolute reward feedback from a single arm or (ii) relative (dueling) feedback from an arm pair, both governed by generalized linear models. We introduce a likelihood-ratio--based confidence sequence that unifies heterogeneous generalized linear observations and yields an explicit ellipsoidal confidence set under a self-concordance assumption. Building on this confidence set, we propose a hybrid Track-and-Stop algorithm that adaptively allocates queries by tracking a minimax-optimal design over a joint action space of arms and pairs. We establish $δ$-correctness and provide high-probability upper bounds on the stopping time. We further extend the framework to a cost-aware setting that accounts for heterogeneous acquisition costs across feedback modalities. Empirical experiments demonstrate that the proposed algorithms significantly improve sample efficiency over baseline methods.

Ähnliche Einträge