| Main Authors: | Wang, Wenjia; Zhang, Xiaowei |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2312.01386 |
Similar Items
Polynomial Regret Concentration of UCB for Non-Deterministic State Transitions
by: Cömer, Can, et al.
Published: (2025)
SPARKLE: A Nonparametric Approach for Online Decision-Making with High-Dimensional Covariates
by: Wang, Wenjia, et al.
Published: (2025)
Statistical Inference under Adaptive Sampling with LinUCB
by: Fan, Wei, et al.
Published: (2025)
Replicable Bandits with UCB based Exploration
by: Deb, Rohan, et al.
Published: (2026)
Tractable Instances of Bilinear Maximization: Implementing LinUCB on Ellipsoids
by: Zhang, Raymond, et al.
Published: (2025)
Directional Optimism for Safe Linear Bandits
by: Hutchinson, Spencer, et al.
Published: (2023)
A characterization of sample adaptivity in UCB data
by: Chen, Yilun, et al.
Published: (2025)
Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Data
by: van Remmerden, Jesse, et al.
Published: (2025)
Truncated LinUCB for Stochastic Linear Bandits
by: Song, Yanglei, et al.
Published: (2022)
On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs
by: Dong, Zixuan, et al.
Published: (2022)
Minimax Optimal Reinforcement Learning with Quasi-Optimism
by: Lee, Harin, et al.
Published: (2025)
Beyond Optimism: Exploration With Partially Observable Rewards
by: Parisi, Simone, et al.
Published: (2024)
UCB Exploration for Fixed-Budget Bayesian Best Arm Identification
by: Zhu, Rong J. B., et al.
Published: (2024)
UCB for Large-Scale Pure Exploration: Beyond Sub-Gaussianity
by: Li, Zaile, et al.
Published: (2025)
Clus-UCB: A Near-Optimal Algorithm for Clustered Bandits
by: Gore, Aakash, et al.
Published: (2025)
Graph Learning Is Suboptimal in Causal Bandits
by: Shahverdikondori, Mohammad, et al.
Published: (2025)
Cooperative Multi-Agent Graph Bandits: UCB Algorithm and Regret Analysis
by: Paschalidis, Phevos, et al.
Published: (2024)
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
by: Qiu, Shuang, et al.
Published: (2022)
A Spatially Informed Gaussian Process UCB Method for Decentralized Coverage Control
by: Guidone, Gennaro, et al.
Published: (2025)
Suboptimal Shapley Value Explanations
by: Lu, Xiaolei
Published: (2025)
UCB-type Algorithm for Budget-Constrained Expert Learning
by: Latypov, Ilgam, et al.
Published: (2025)
Precise Asymptotics and Refined Regret of Variance-Aware UCB
by: Fan, Yingying, et al.
Published: (2024)
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits
by: Li, Mengmeng, et al.
Published: (2024)
DROP: Distributional and Regular Optimism and Pessimism for Reinforcement Learning
by: Kobayashi, Taisuke
Published: (2024)
Provably Efficient UCB-type Algorithms For Learning Predictive State Representations
by: Huang, Ruiquan, et al.
Published: (2023)
UCB-driven Utility Function Search for Multi-objective Reinforcement Learning
by: Shi, Yucheng, et al.
Published: (2024)
Revisiting Social Welfare in Bandits: UCB is (Nearly) All You Need
by: Sarkar, Dhruv, et al.
Published: (2025)
DAK-UCB: Diversity-Aware Prompt Routing for LLMs and Generative Models
by: Jafari, Donya, et al.
Published: (2026)
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
by: Bui, Ha Manh, et al.
Published: (2024)
Extended UCB Policies for Multi-armed Bandit Problems
by: Liu, Keqin, et al.
Published: (2011)
Attack-Resistant Uniform Fairness for Linear and Smooth Contextual Bandits
by: Zhang, Qingwen, et al.
Published: (2026)
Reward-Based Online LLM Routing via NeuralUCB
by: Tsai, Ming-Hua, et al.
Published: (2026)
Minimizing UCB: a Better Local Search Strategy in Local Bayesian Optimization
by: Fan, Zheyi, et al.
Published: (2024)
Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning
by: Zhang, Runyu, et al.
Published: (2025)
Source-Optimal Training is Transfer-Suboptimal
by: Hedges, C. Evans
Published: (2025)
On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning
by: Mhaisen, Naram, et al.
Published: (2025)
UCB algorithms for multi-armed bandits: Precise regret and adaptive inference
by: Han, Qiyang, et al.
Published: (2024)
Connecting Thompson Sampling and UCB: Towards More Efficient Trade-offs Between Privacy and Regret
by: Hu, Bingshan, et al.
Published: (2025)
Online (Non-)Convex Learning via Tempered Optimism
by: Haddouche, Maxime, et al.
Published: (2023)
Implicit Riemannian Optimism with Applications to Min-Max Problems
by: Roux, Christophe, et al.
Published: (2025)