:: Library Catalog

Bibliographic Details
Main Authors:	Ouyang, Xiaxue; Kang, Xinlai; Li, Mengyu; Dou, Zhenxing; Yu, Jun; Meng, Cheng
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Computation
Online Access:	https://arxiv.org/abs/2509.16085

Similar Items

Leveraging priors on distribution functions for multi-arm bandits
by: Vashishtha, Sumit, et al.
Published: (2025)

Trading off rewards and errors in multi-armed bandits
by: Erraqabi, Akram, et al.
Published: (2026)

Minimax-optimal trust-aware multi-armed bandits
by: Cai, Changxiao, et al.
Published: (2024)

Functional multi-armed bandit and the best function identification problems
by: Dorn, Yuriy, et al.
Published: (2025)

Information maximization for a broad variety of multi-armed bandit games
by: Barbier-Chebbah, Alex, et al.
Published: (2025)

Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling
by: Qin, Hao, et al.
Published: (2025)

UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
by: Han, Qiyang, et al.
Published: (2024)

Softmax gradient policy for variance minimization and risk-averse multi armed bandits
by: Turinici, Gabriel
Published: (2026)

HELLINGER-UCB: A novel algorithm for stochastic multi-armed bandit problem and cold start problem in recommender system
by: Yang, Ruibo, et al.
Published: (2024)

A survey on multi-player bandits
by: Boursier, Etienne, et al.
Published: (2022)

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
by: Cen, Rui, et al.
Published: (2026)

Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
by: Mei, Jincheng, et al.
Published: (2025)

Importance Sparsification for Sinkhorn Algorithm
by: Li, Mengyu, et al.
Published: (2023)

Continuous-time multi-armed bandits under random intervention times
by: Noba, Kei, et al.
Published: (2026)

Extreme bandits
by: Carpentier, Alexandra, et al.
Published: (2026)

Prior-informed optimization of treatment recommendation via bandit algorithms trained on large language model-processed historical records
by: Nessari, Saman, et al.
Published: (2025)

Asymptotic properties of a multicolored random reinforced urn model with an application to multi-armed bandits
by: Yang, Li, et al.
Published: (2024)

Spectral bandits
by: Kocák, Tomáš, et al.
Published: (2026)

Adversarial bandit optimization for approximately linear functions
by: Cheng, Zhuoyu, et al.
Published: (2025)

Solving multi-armed bandit problems using a chaotic microresonator comb
by: Cuevas, Jonathan, et al.
Published: (2023)

Unified theory of upper confidence bound policies for bandit problems targeting total reward, maximal reward, and more
by: Kikkawa, Nobuaki, et al.
Published: (2024)

Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment
by: Hu, Songqiao, et al.
Published: (2025)

Efficient kernelized bandit algorithms via exploration distributions
by: Hu, Bingshan, et al.
Published: (2025)

Active clustering with bandit feedback
by: Thuot, Victor, et al.
Published: (2024)

Information-directed sampling for bandits: a primer
by: Hirling, Annika, et al.
Published: (2025)

Ensemble sampling for linear bandits: small ensembles suffice
by: Janz, David, et al.
Published: (2023)

Regularization can make diffusion models more efficient
by: Taheri, Mahsa, et al.
Published: (2025)

Positive region preserved random sampling: an efficient feature selection method for massive data
by: Bai, Hexiang, et al.
Published: (2025)

Instance-dependent Stochastic Lipschitz bandit
by: Potfer, Marius, et al.
Published: (2026)

Spectral bandits for smooth graph functions
by: Valko, Michal, et al.
Published: (2026)

Approximate information maximization for bandit games
by: Barbier-Chebbah, Alex, et al.
Published: (2023)

Online learning in bandits with predicted context
by: Guo, Yongyi, et al.
Published: (2023)

Redundant feature screening method for human activity recognition based on attention purification mechanism
by: Li, Xiaoyang, et al.
Published: (2025)

Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback
by: Puthiya Parambath, Shameem A, et al.
Published: (2024)

Multi-armed bandit based online model selection for concept-drift adaptation
by: Wilson, Jobin, et al.
Published: (2024)

Variance-sensitive Thompson sampling for generalised linear bandits, revisited
by: Perneczky, Tom, et al.
Published: (2026)

Risk and optimal policies in bandit experiments
by: Adusumilli, Karun
Published: (2021)

Linear bandits with polylogarithmic minimax regret
by: Lumbreras, Josep, et al.
Published: (2024)

Revealing graph bandits for maximizing local influence
by: Carpentier, Alexandra, et al.
Published: (2026)

On the optimal regret of collaborative personalized linear bandits
by: Huang, Bruce, et al.
Published: (2025)