Library Catalog

Bibliographic Details
Main Authors:	Wang, Wenjia; Zhang, Xiaowei
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2312.01386

Similar Items

Polynomial Regret Concentration of UCB for Non-Deterministic State Transitions
by: Cömer, Can, et al.
Published: (2025)

SPARKLE: A Nonparametric Approach for Online Decision-Making with High-Dimensional Covariates
by: Wang, Wenjia, et al.
Published: (2025)

Statistical Inference under Adaptive Sampling with LinUCB
by: Fan, Wei, et al.
Published: (2025)

Replicable Bandits with UCB based Exploration
by: Deb, Rohan, et al.
Published: (2026)

Tractable Instances of Bilinear Maximization: Implementing LinUCB on Ellipsoids
by: Zhang, Raymond, et al.
Published: (2025)

Directional Optimism for Safe Linear Bandits
by: Hutchinson, Spencer, et al.
Published: (2023)

A characterization of sample adaptivity in UCB data
by: Chen, Yilun, et al.
Published: (2025)

Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Data
by: van Remmerden, Jesse, et al.
Published: (2025)

Truncated LinUCB for Stochastic Linear Bandits
by: Song, Yanglei, et al.
Published: (2022)

On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs
by: Dong, Zixuan, et al.
Published: (2022)

Minimax Optimal Reinforcement Learning with Quasi-Optimism
by: Lee, Harin, et al.
Published: (2025)

Beyond Optimism: Exploration With Partially Observable Rewards
by: Parisi, Simone, et al.
Published: (2024)

UCB Exploration for Fixed-Budget Bayesian Best Arm Identification
by: Zhu, Rong J. B., et al.
Published: (2024)

UCB for Large-Scale Pure Exploration: Beyond Sub-Gaussianity
by: Li, Zaile, et al.
Published: (2025)

Clus-UCB: A Near-Optimal Algorithm for Clustered Bandits
by: Gore, Aakash, et al.
Published: (2025)

Graph Learning Is Suboptimal in Causal Bandits
by: Shahverdikondori, Mohammad, et al.
Published: (2025)

Cooperative Multi-Agent Graph Bandits: UCB Algorithm and Regret Analysis
by: Paschalidis, Phevos, et al.
Published: (2024)

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2022)

A Spatially Informed Gaussian Process UCB Method for Decentralized Coverage Control
by: Guidone, Gennaro, et al.
Published: (2025)

Suboptimal Shapley Value Explanations
by: Lu, Xiaolei
Published: (2025)

UCB-type Algorithm for Budget-Constrained Expert Learning
by: Latypov, Ilgam, et al.
Published: (2025)

Precise Asymptotics and Refined Regret of Variance-Aware UCB
by: Fan, Yingying, et al.
Published: (2024)

Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits
by: Li, Mengmeng, et al.
Published: (2024)

DROP: Distributional and Regular Optimism and Pessimism for Reinforcement Learning
by: Kobayashi, Taisuke
Published: (2024)

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations
by: Huang, Ruiquan, et al.
Published: (2023)

UCB-driven Utility Function Search for Multi-objective Reinforcement Learning
by: Shi, Yucheng, et al.
Published: (2024)

Revisiting Social Welfare in Bandits: UCB is (Nearly) All You Need
by: Sarkar, Dhruv, et al.
Published: (2025)

DAK-UCB: Diversity-Aware Prompt Routing for LLMs and Generative Models
by: Jafari, Donya, et al.
Published: (2026)

Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
by: Bui, Ha Manh, et al.
Published: (2024)

Extended UCB Policies for Multi-armed Bandit Problems
by: Liu, Keqin, et al.
Published: (2011)

Attack-Resistant Uniform Fairness for Linear and Smooth Contextual Bandits
by: Zhang, Qingwen, et al.
Published: (2026)

Reward-Based Online LLM Routing via NeuralUCB
by: Tsai, Ming-Hua, et al.
Published: (2026)

Minimizing UCB: a Better Local Search Strategy in Local Bayesian Optimization
by: Fan, Zheyi, et al.
Published: (2024)

Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning
by: Zhang, Runyu, et al.
Published: (2025)

Source-Optimal Training is Transfer-Suboptimal
by: Hedges, C. Evans
Published: (2025)

On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning
by: Mhaisen, Naram, et al.
Published: (2025)

UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
by: Han, Qiyang, et al.
Published: (2024)

Connecting Thompson Sampling and UCB: Towards More Efficient Trade-offs Between Privacy and Regret
by: Hu, Bingshan, et al.
Published: (2025)

Online (Non-)Convex Learning via Tempered Optimism
by: Haddouche, Maxime, et al.
Published: (2023)

Implicit Riemannian Optimism with Applications to Min-Max Problems
by: Roux, Christophe, et al.
Published: (2025)