:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Ruibo; Wang, Jiazhou; Mullhaupt, Andrew
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2404.10207

Similar Items

UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
by: Han, Qiyang, et al.
Published: (2024)

Functional multi-armed bandit and the best function identification problems
by: Dorn, Yuriy, et al.
Published: (2025)

Fast UCB-type algorithms for stochastic bandits with heavy and super heavy symmetric noise
by: Dorn, Yuriy, et al.
Published: (2024)

Bounding Neyman-Pearson Region with $f$-Divergences
by: Mullhaupt, Andrew, et al.
Published: (2025)

Leveraging priors on distribution functions for multi-arm bandits
by: Vashishtha, Sumit, et al.
Published: (2025)

Trading off rewards and errors in multi-armed bandits
by: Erraqabi, Akram, et al.
Published: (2026)

Minimax-optimal trust-aware multi-armed bandits
by: Cai, Changxiao, et al.
Published: (2024)

Information maximization for a broad variety of multi-armed bandit games
by: Barbier-Chebbah, Alex, et al.
Published: (2025)

Spectral bandits for smooth graph functions with applications in recommender systems
by: Kocák, Tomáš, et al.
Published: (2026)

Quantum contextual bandits and recommender systems for quantum data
by: Brahmachari, Shrigyan, et al.
Published: (2023)

Extended UCB Policies for Multi-armed Bandit Problems
by: Liu, Keqin, et al.
Published: (2011)

Softmax gradient policy for variance minimization and risk-averse multi armed bandits
by: Turinici, Gabriel
Published: (2026)

Efficient learning by implicit exploration in bandit problems with side observations
by: Kocak, Tomas, et al.
Published: (2026)

Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling
by: Qin, Hao, et al.
Published: (2025)

A more efficient method for large-sample model-free feature screening via multi-armed bandits
by: Ouyang, Xiaxue, et al.
Published: (2025)

Using causal abstractions to accelerate decision-making in complex bandit problems
by: Dyer, Joel, et al.
Published: (2025)

Offline-to-online hyperparameter transfer for stochastic bandits
by: Sharma, Dravyansh, et al.
Published: (2025)

Prior-informed optimization of treatment recommendation via bandit algorithms trained on large language model-processed historical records
by: Nessari, Saman, et al.
Published: (2025)

A survey on multi-player bandits
by: Boursier, Etienne, et al.
Published: (2022)

A single algorithm for both restless and rested rotting bandits
by: Seznec, Julien, et al.
Published: (2026)

Efficient kernelized bandit algorithms via exploration distributions
by: Hu, Bingshan, et al.
Published: (2025)

Covariance-adapting algorithm for semi-bandits with application to sparse rewards
by: Perrault, Pierre, et al.
Published: (2026)

Extreme bandits
by: Carpentier, Alexandra, et al.
Published: (2026)

Unified theory of upper confidence bound policies for bandit problems targeting total reward, maximal reward, and more
by: Kikkawa, Nobuaki, et al.
Published: (2024)

Solving cold start in news recommendations: a RippleNet-based system for large scale media outlet
by: Radziszewski, Karol, et al.
Published: (2025)

Solving multi-armed bandit problems using a chaotic microresonator comb
by: Cuevas, Jonathan, et al.
Published: (2023)

Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
by: Mei, Jincheng, et al.
Published: (2025)

Heuristic algorithms for the stochastic critical node detection problem
by: Bayarsaikhan, Tuguldur, et al.
Published: (2025)

Spectral bandits
by: Kocák, Tomáš, et al.
Published: (2026)

On the Suboptimality of GP-UCB under Polynomial Effective Optimism
by: Wang, Wenjia, et al.
Published: (2023)

Replicable Bandits with UCB based Exploration
by: Deb, Rohan, et al.
Published: (2026)

Adaptive political surveys and GPT-4: Tackling the cold start problem with simulated user interactions
by: Bachmann, Fynn, et al.
Published: (2025)

A characterization of sample adaptivity in UCB data
by: Chen, Yilun, et al.
Published: (2025)

On the optimal regret of collaborative personalized linear bandits
by: Huang, Bruce, et al.
Published: (2025)

Active clustering with bandit feedback
by: Thuot, Victor, et al.
Published: (2024)

Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment
by: Hu, Songqiao, et al.
Published: (2025)

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2022)

An accelerated first-order regularized momentum descent ascent algorithm for stochastic nonconvex-concave minimax problems
by: Zhang, Huiling, et al.
Published: (2023)

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations
by: Huang, Ruiquan, et al.
Published: (2023)

Langevin dynamics based algorithm e-TH$\varepsilon$O POULA for stochastic optimization problems with discontinuous stochastic gradient
by: Lim, Dong-Young, et al.
Published: (2022)