:: Library Catalog

Bibliographic Details
Main Authors:	Bae, Seoungbin; Kang, Garyeong; Lee, Dabeen
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.19300

Similar Items

Learning to Route and Schedule LLMs from User Retrials via Contextual Queueing Bandits
by: Bae, Seoungbin, et al.
Published: (2026)

Logistic Bandits with $\tilde{O}(\sqrt{dT})$ Regret without Context Diversity Assumptions
by: Bae, Seoungbin, et al.
Published: (2026)

Neural Logistic Bandits
by: Bae, Seoungbin, et al.
Published: (2025)

Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses
by: Yu, Kihyun, et al.
Published: (2026)

Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
by: Yu, Kihyun, et al.
Published: (2026)

Chebyshev Center-Based Direction Selection for Multi-Objective Optimization and Training PINNs
by: Yoon, Hoyeol, et al.
Published: (2026)

Minimizing Queue Length Regret for Arbitrarily Varying Channels
by: Krishnakumar, G, et al.
Published: (2025)

Queueing Matching Bandits with Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2024)

Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism
by: Yu, Kihyun, et al.
Published: (2024)

Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
by: Yang, Zixian, et al.
Published: (2025)

Variance-Dependent Regret Lower Bounds for Contextual Bandits
by: He, Jiafan, et al.
Published: (2025)

Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
by: Di, Qiwei, et al.
Published: (2023)

Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret
by: Yu, Kihyun, et al.
Published: (2026)

Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
by: Levy, Orin, et al.
Published: (2025)

Reinforcement Learning Based Traffic Signal Design to Minimize Queue Lengths
by: Nandakumar, Anirud, et al.
Published: (2025)

Optimal Regret for Policy Optimization in Contextual Bandits
by: Levy, Orin, et al.
Published: (2026)

Fast Best-in-Class Regret for Contextual Bandits
by: Girard, Samuel, et al.
Published: (2025)

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion
by: Lee, Junghyun, et al.
Published: (2023)

Q-Net: Queue Length Estimation via Kalman-based Neural Networks
by: Gao, Ting, et al.
Published: (2025)

Design and Scheduling of an AI-based Queueing System
by: Lee, Jiung, et al.
Published: (2024)

How Does Variance Shape the Regret in Contextual Bandits?
by: Jia, Zeyu, et al.
Published: (2024)

FedQueue: Queue-Aware Federated Learning for Cross-Facility HPC Training
by: Li, Yijiang, et al.
Published: (2026)

Leveraging Queue Length and Attention Mechanisms for Enhanced Traffic Signal Control Optimization
by: Zhang, Liang, et al.
Published: (2021)

Improved Regret Bounds for Bandits with Expert Advice
by: Cesa-Bianchi, Nicolò, et al.
Published: (2024)

On the Optimal Regret of Locally Private Linear Contextual Bandit
by: Li, Jiachun, et al.
Published: (2024)

Active Context Selection Improves Simple Regret in Contextual Bandits
by: Shahverdikondori, Mohammad, et al.
Published: (2026)

Finite-Time Minimax Bounds and an Optimal Lyapunov Policy in Queueing Control
by: Liu, Yujie, et al.
Published: (2025)

Parameter-Free Algorithms for Performative Regret Minimization under Decision-Dependent Distributions
by: Park, Sungwoo, et al.
Published: (2024)

Learning-Augmented Priority Queues
by: Benomar, Ziyad, et al.
Published: (2024)

Information Capacity Regret Bounds for Bandits with Mediator Feedback
by: Eldowa, Khaled, et al.
Published: (2024)

Regime-Calibrated Fleet Repositioning with a Spatial Queue-Regret Decomposition
by: Kumar, Indar, et al.
Published: (2026)

Doubly-Bounded Queue for Constrained Online Learning: Keeping Pace with Dynamics of Both Loss and Constraint
by: Wang, Juncheng, et al.
Published: (2024)

Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms
by: Réveillard, William, et al.
Published: (2025)

Regret Bounds for Noise-Free Cascaded Kernelized Bandits
by: Li, Zihan, et al.
Published: (2022)

Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference
by: Jamshidi, Fateme, et al.
Published: (2025)

Near-optimal Per-Action Regret Bounds for Sleeping Bandits
by: Nguyen, Quan, et al.
Published: (2024)

Queue-based Eco-Driving at Roundabouts with Reinforcement Learning
by: Schlamp, Anna-Lena, et al.
Published: (2024)

The Transient Cost of Learning in Queueing Systems
by: Freund, Daniel, et al.
Published: (2023)

Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
by: Kim, Seok-Jin, et al.
Published: (2024)

Stochastic-Constrained Stochastic Optimization with Markovian Data
by: Kim, Yeongjong, et al.
Published: (2023)