| Main Authors: | Bae, Seoungbin, Kang, Garyeong, Lee, Dabeen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.19300 |
Similar Items
Learning to Route and Schedule LLMs from User Retrials via Contextual Queueing Bandits
by: Bae, Seoungbin, et al.
Published: (2026)
Logistic Bandits with $\tilde{O}(\sqrt{dT})$ Regret without Context Diversity Assumptions
by: Bae, Seoungbin, et al.
Published: (2026)
Neural Logistic Bandits
by: Bae, Seoungbin, et al.
Published: (2025)
Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses
by: Yu, Kihyun, et al.
Published: (2026)
Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
by: Yu, Kihyun, et al.
Published: (2026)
Chebyshev Center-Based Direction Selection for Multi-Objective Optimization and Training PINNs
by: Yoon, Hoyeol, et al.
Published: (2026)
Minimizing Queue Length Regret for Arbitrarily Varying Channels
by: Krishnakumar, G, et al.
Published: (2025)
Queueing Matching Bandits with Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2024)
Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism
by: Yu, Kihyun, et al.
Published: (2024)
Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
by: Yang, Zixian, et al.
Published: (2025)
Variance-Dependent Regret Lower Bounds for Contextual Bandits
by: He, Jiafan, et al.
Published: (2025)
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
by: Di, Qiwei, et al.
Published: (2023)
Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret
by: Yu, Kihyun, et al.
Published: (2026)
Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
by: Levy, Orin, et al.
Published: (2025)
Reinforcement Learning Based Traffic Signal Design to Minimize Queue Lengths
by: Nandakumar, Anirud, et al.
Published: (2025)
Optimal Regret for Policy Optimization in Contextual Bandits
by: Levy, Orin, et al.
Published: (2026)
Fast Best-in-Class Regret for Contextual Bandits
by: Girard, Samuel, et al.
Published: (2025)
Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion
by: Lee, Junghyun, et al.
Published: (2023)
Q-Net: Queue Length Estimation via Kalman-based Neural Networks
by: Gao, Ting, et al.
Published: (2025)
Design and Scheduling of an AI-based Queueing System
by: Lee, Jiung, et al.
Published: (2024)
How Does Variance Shape the Regret in Contextual Bandits?
by: Jia, Zeyu, et al.
Published: (2024)
FedQueue: Queue-Aware Federated Learning for Cross-Facility HPC Training
by: Li, Yijiang, et al.
Published: (2026)
Leveraging Queue Length and Attention Mechanisms for Enhanced Traffic Signal Control Optimization
by: Zhang, Liang, et al.
Published: (2021)
Improved Regret Bounds for Bandits with Expert Advice
by: Cesa-Bianchi, Nicolò, et al.
Published: (2024)
On the Optimal Regret of Locally Private Linear Contextual Bandit
by: Li, Jiachun, et al.
Published: (2024)
Active Context Selection Improves Simple Regret in Contextual Bandits
by: Shahverdikondori, Mohammad, et al.
Published: (2026)
Finite-Time Minimax Bounds and an Optimal Lyapunov Policy in Queueing Control
by: Liu, Yujie, et al.
Published: (2025)
Parameter-Free Algorithms for Performative Regret Minimization under Decision-Dependent Distributions
by: Park, Sungwoo, et al.
Published: (2024)
Learning-Augmented Priority Queues
by: Benomar, Ziyad, et al.
Published: (2024)
Information Capacity Regret Bounds for Bandits with Mediator Feedback
by: Eldowa, Khaled, et al.
Published: (2024)
Regime-Calibrated Fleet Repositioning with a Spatial Queue-Regret Decomposition
by: Kumar, Indar, et al.
Published: (2026)
Doubly-Bounded Queue for Constrained Online Learning: Keeping Pace with Dynamics of Both Loss and Constraint
by: Wang, Juncheng, et al.
Published: (2024)
Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms
by: Réveillard, William, et al.
Published: (2025)
Regret Bounds for Noise-Free Cascaded Kernelized Bandits
by: Li, Zihan, et al.
Published: (2022)
Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference
by: Jamshidi, Fateme, et al.
Published: (2025)
Near-optimal Per-Action Regret Bounds for Sleeping Bandits
by: Nguyen, Quan, et al.
Published: (2024)
Queue-based Eco-Driving at Roundabouts with Reinforcement Learning
by: Schlamp, Anna-Lena, et al.
Published: (2024)
The Transient Cost of Learning in Queueing Systems
by: Freund, Daniel, et al.
Published: (2023)
Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
by: Kim, Seok-Jin, et al.
Published: (2024)
Stochastic-Constrained Stochastic Optimization with Markovian Data
by: Kim, Yeongjong, et al.
Published: (2023)