Bibliographic Details
Main Author: Salem, Tareq Si
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2601.19867
_version_ 1866917225686892544
author Salem, Tareq Si
author_facet Salem, Tareq Si
contents We investigate the challenging problem of adversarial multi-armed bandits operating under time-varying constraints, a scenario motivated by numerous real-world applications. To address this complex setting, we propose a novel primal-dual algorithm that extends online mirror descent through the incorporation of suitable gradient estimators and effective constraint handling. We provide theoretical guarantees establishing sublinear dynamic regret and sublinear constraint violation for our proposed policy. Our algorithm achieves state-of-the-art performance in terms of both regret and constraint violation. Empirical evaluations demonstrate the superiority of our approach.
format Preprint
id arxiv_https___arxiv_org_abs_2601_19867
institution arXiv
publishDate 2026
record_format arxiv
title Bandits in Flux: Adversarial Constraints in Dynamic Environments
topic Machine Learning
url https://arxiv.org/abs/2601.19867