:: Library Catalog

Bibliographic Details
Main Authors:	Qin, Hao; Zhang, Chicheng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.09456

Similar Items

Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards
by: Qin, Hao, et al.
Published: (2023)

Physics-Informed Parametric Bandits for Beam Alignment in mmWave Communications
by: Qin, Hao, et al.
Published: (2025)

Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits
by: Jang, Kyoungseok, et al.
Published: (2024)

Group-Sensitive Offline Contextual Bandits
by: Guo, Yihong, et al.
Published: (2025)

Beyond Task Diversity: Provable Representation Transfer for Sequential Multi-Task Linear Bandits
by: Duong, Thang, et al.
Published: (2025)

Bridging Lifelong and Multi-Task Representation Learning via Algorithm and Complexity Measure
by: Wang, Zhi, et al.
Published: (2025)

Offline Contextual Bandits in the Presence of New Actions
by: Kishimoto, Ren, et al.
Published: (2026)

Offline Contextual Bandit with Counterfactual Sample Identification
by: Gilotte, Alexandre, et al.
Published: (2025)

Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff
by: Qian, Jian, et al.
Published: (2024)

Learning with Incomplete Context: Linear Contextual Bandits with Pretrained Imputation
by: Yan, Hao, et al.
Published: (2025)

Direction-Aware Offline-to-Online Learning in Linear Contextual Bandits
by: Han, Zean, et al.
Published: (2026)

Oracle-Efficient Combinatorial Semi-Bandits
by: Kim, Jung-hun, et al.
Published: (2025)

Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach
by: Li, Yinan, et al.
Published: (2023)

Contextual Bandits for Unbounded Context Distributions
by: Zhao, Puning, et al.
Published: (2024)

Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling
by: Qin, Hao, et al.
Published: (2025)

Leveraging Offline Data in Linear Latent Contextual Bandits
by: Kausik, Chinmaya, et al.
Published: (2024)

Contextual Linear Bandits under Noisy Features: Towards Bayesian Oracles
by: Kim, Jung-hun, et al.
Published: (2017)

Constrained Contextual Bandits with Adversarial Contexts
by: Sarkar, Dhruv, et al.
Published: (2026)

Causal Contextual Bandits with Adaptive Context
by: Madhavan, Rahul, et al.
Published: (2024)

Efficient Contextual Bandits with Uninformed Feedback Graphs
by: Zhang, Mengxiao, et al.
Published: (2024)

Improved Offline Contextual Bandits with Second-Order Bounds: Betting and Freezing
by: Ryu, J. Jon, et al.
Published: (2025)

Towards Fundamental Limits for Active Multi-distribution Learning
by: Zhang, Chicheng, et al.
Published: (2025)

Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
by: Li, Yichen, et al.
Published: (2023)

Interactive and Hybrid Imitation Learning: Provably Beating Behavior Cloning
by: Li, Yichen, et al.
Published: (2024)

Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback
by: Goyal, Tanmay, et al.
Published: (2025)

The Sample Complexity of Multiclass and Sparse Contextual Bandits
by: Erez, Liad, et al.
Published: (2026)

Active Context Selection Improves Simple Regret in Contextual Bandits
by: Shahverdikondori, Mohammad, et al.
Published: (2026)

Optimizing Warfarin Dosing Using Contextual Bandit: An Offline Policy Learning and Evaluation Method
by: Huang, Yong, et al.
Published: (2024)

Offline Constrained RLHF with Multiple Preference Oracles
by: Latham, Brenden, et al.
Published: (2026)

Efficient Generalized Low-Rank Tensor Contextual Bandits
by: Yi, Qianxin, et al.
Published: (2023)

Context-Action Embedding Learning for Off-Policy Evaluation in Contextual Bandits
by: Chandak, Kushagra, et al.
Published: (2025)

Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits
by: Zhao, Qingyue, et al.
Published: (2025)

Efficient Adversarial Attacks on High-dimensional Offline Bandits
by: Hosseini, Seyed Mohammad Hadi, et al.
Published: (2026)

IBCB: Efficient Inverse Batched Contextual Bandit for Behavioral Evolution History
by: Xu, Yi, et al.
Published: (2024)

Contextual Linear Bandits with Delay as Payoff
by: Zhang, Mengxiao, et al.
Published: (2025)

Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
by: Duong, Thang, et al.
Published: (2025)

High Probability Bound for Cross-Learning Contextual Bandits with Unknown Context Distributions
by: Huang, Ruiyuan, et al.
Published: (2024)

Recycling History: Efficient Recommendations from Contextual Dueling Bandits
by: Sankagiri, Suryanarayana, et al.
Published: (2025)

Bayesian Regret Minimization in Offline Bandits
by: Petrik, Marek, et al.
Published: (2023)

Sparse Nonparametric Contextual Bandits
by: Flynn, Hamish, et al.
Published: (2025)