Library Catalog

Bibliographic Details
Main Authors:	Lee, Joongkyu; Park, Seung Joon; Tang, Yunhao; Oh, Min-hwan
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.05439

Similar Items

Combinatorial Reinforcement Learning with Preference Feedback
by: Lee, Joongkyu, et al.
Published: (2025)

Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
by: Lee, Joongkyu, et al.
Published: (2024)

Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
by: Lee, Joongkyu, et al.
Published: (2024)

Improved Online Confidence Bounds for Multinomial Logistic Bandits
by: Lee, Joongkyu, et al.
Published: (2025)

Optimal Design for Multinomial Logit Model with Applications to Best Assortment Identification
by: Lee, Joongkyu, et al.
Published: (2026)

Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
by: Lee, Joongkyu, et al.
Published: (2026)

Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
by: Lee, Joongkyu, et al.
Published: (2025)

Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
by: Cho, Wooseong, et al.
Published: (2024)

Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards
by: Yoon, Deokgyu, et al.
Published: (2026)

Minimax Optimal Reinforcement Learning with Quasi-Optimism
by: Lee, Harin, et al.
Published: (2025)

Unified Framework of Distributional Regret in Multi-Armed Bandits and Reinforcement Learning
by: Lee, Harin, et al.
Published: (2026)

Symmetry-Aware GFlowNets
by: Kim, Hohyun, et al.
Published: (2025)

Improved Regret of Linear Ensemble Sampling
by: Lee, Harin, et al.
Published: (2024)

Infrequent Exploration in Linear Bandits
by: Lee, Harin, et al.
Published: (2025)

Model-Based Reinforcement Learning with Multinomial Logistic Function Approximation
by: Hwang, Taehyun, et al.
Published: (2022)

Peng's Q($λ$) for Conservative Value Estimation in Offline Reinforcement Learning
by: Kim, Byeongchan, et al.
Published: (2026)

Thompson Sampling for Multi-Objective Linear Contextual Bandit
by: Park, Somangchan, et al.
Published: (2025)

Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2025)

Lasso Bandit with Compatibility Condition on Optimal Arm
by: Lee, Harin, et al.
Published: (2024)

Follow-the-Perturbed-Leader for Decoupled Bandits: Best-of-Both-Worlds and Practicality
by: Kim, Chaiwon, et al.
Published: (2025)

Blessings of Multiple Good Arms in Multi-Objective Linear Bandits
by: Ann, Heesang, et al.
Published: (2026)

Optimal and Practical Batched Linear Bandit Algorithm
by: Yu, Sanghoon, et al.
Published: (2025)

Practical and Optimal Algorithm for Linear Contextual Bandits with Rare Parameter Updates
by: Yu, Sanghoon, et al.
Published: (2026)

Latent Representation Alignment for Offline Goal-Conditioned Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2026)

Queueing Matching Bandits with Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2024)

Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
by: Kim, Seok-Jin, et al.
Published: (2024)

Exploration via Feature Perturbation in Contextual Bandits
by: Yi, Seouh-won, et al.
Published: (2025)

ADAM Optimization with Adaptive Batch Selection
by: Kim, Gyu Yeol, et al.
Published: (2025)

Stochastic Matching Bandits with Rare Optimization Updates
by: Kim, Jung-hun, et al.
Published: (2025)

Dynamic Assortment Selection and Pricing with Censored Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2025)

Doubly Perturbed Task Free Continual Learning
by: Lee, Byung Hyun, et al.
Published: (2023)

Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds
by: Lee, Jongyeong, et al.
Published: (2024)

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
by: Lee, Jongyeong, et al.
Published: (2025)

Convergence of Muon with Newton-Schulz
by: Kim, Gyu Yeol, et al.
Published: (2026)

Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities
by: Hwang, Taehyun, et al.
Published: (2026)

Linear Bandits with Partially Observable Features
by: Kim, Wonyoung, et al.
Published: (2025)

Oracle-Efficient Combinatorial Semi-Bandits
by: Kim, Jung-hun, et al.
Published: (2025)

Experimental Design for Semiparametric Bandits
by: Kim, Seok-Jin, et al.
Published: (2025)

Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks
by: Mucsányi, Bálint, et al.
Published: (2024)

Semantic-Aware Gaussian Process Calibration with Structured Layerwise Kernels for Deep Neural Networks
by: Lee, Kyung-hwan, et al.
Published: (2025)