| Main Authors: | Lee, Joongkyu, Park, Seung Joon, Tang, Yunhao, Oh, Min-hwan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.05439 |
Similar Items
Combinatorial Reinforcement Learning with Preference Feedback
by: Lee, Joongkyu, et al.
Published: (2025)
Demystifying Linear MDPs and Novel Dynamics Aggregation Framework
by: Lee, Joongkyu, et al.
Published: (2024)
Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
by: Lee, Joongkyu, et al.
Published: (2024)
Improved Online Confidence Bounds for Multinomial Logistic Bandits
by: Lee, Joongkyu, et al.
Published: (2025)
Optimal Design for Multinomial Logit Model with Applications to Best Assortment Identification
by: Lee, Joongkyu, et al.
Published: (2026)
Nonstationary Generalized Linear Bandits with Discounted Online Mirror Descent
by: Lee, Joongkyu, et al.
Published: (2026)
Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
by: Lee, Joongkyu, et al.
Published: (2025)
Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
by: Cho, Wooseong, et al.
Published: (2024)
Multi-Step Likelihood-Ratio Correction for Reinforcement Learning with Verifiable Rewards
by: Yoon, Deokgyu, et al.
Published: (2026)
Minimax Optimal Reinforcement Learning with Quasi-Optimism
by: Lee, Harin, et al.
Published: (2025)
Unified Framework of Distributional Regret in Multi-Armed Bandits and Reinforcement Learning
by: Lee, Harin, et al.
Published: (2026)
Symmetry-Aware GFlowNets
by: Kim, Hohyun, et al.
Published: (2025)
Improved Regret of Linear Ensemble Sampling
by: Lee, Harin, et al.
Published: (2024)
Infrequent Exploration in Linear Bandits
by: Lee, Harin, et al.
Published: (2025)
Model-Based Reinforcement Learning with Multinomial Logistic Function Approximation
by: Hwang, Taehyun, et al.
Published: (2022)
Peng's Q($λ$) for Conservative Value Estimation in Offline Reinforcement Learning
by: Kim, Byeongchan, et al.
Published: (2026)
Thompson Sampling for Multi-Objective Linear Contextual Bandit
by: Park, Somangchan, et al.
Published: (2025)
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2025)
Lasso Bandit with Compatibility Condition on Optimal Arm
by: Lee, Harin, et al.
Published: (2024)
Follow-the-Perturbed-Leader for Decoupled Bandits: Best-of-Both-Worlds and Practicality
by: Kim, Chaiwon, et al.
Published: (2025)
Blessings of Multiple Good Arms in Multi-Objective Linear Bandits
by: Ann, Heesang, et al.
Published: (2026)
Optimal and Practical Batched Linear Bandit Algorithm
by: Yu, Sanghoon, et al.
Published: (2025)
Practical and Optimal Algorithm for Linear Contextual Bandits with Rare Parameter Updates
by: Yu, Sanghoon, et al.
Published: (2026)
Latent Representation Alignment for Offline Goal-Conditioned Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2026)
Queueing Matching Bandits with Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2024)
Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
by: Kim, Seok-Jin, et al.
Published: (2024)
Exploration via Feature Perturbation in Contextual Bandits
by: Yi, Seouh-won, et al.
Published: (2025)
ADAM Optimization with Adaptive Batch Selection
by: Kim, Gyu Yeol, et al.
Published: (2025)
Stochastic Matching Bandits with Rare Optimization Updates
by: Kim, Jung-hun, et al.
Published: (2025)
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
by: Kim, Jung-hun, et al.
Published: (2025)
Doubly Perturbed Task Free Continual Learning
by: Lee, Byung Hyun, et al.
Published: (2023)
Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds
by: Lee, Jongyeong, et al.
Published: (2024)
Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
by: Lee, Jongyeong, et al.
Published: (2025)
Convergence of Muon with Newton-Schulz
by: Kim, Gyu Yeol, et al.
Published: (2026)
Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities
by: Hwang, Taehyun, et al.
Published: (2026)
Linear Bandits with Partially Observable Features
by: Kim, Wonyoung, et al.
Published: (2025)
Oracle-Efficient Combinatorial Semi-Bandits
by: Kim, Jung-hun, et al.
Published: (2025)
Experimental Design for Semiparametric Bandits
by: Kim, Seok-Jin, et al.
Published: (2025)
Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks
by: Mucsányi, Bálint, et al.
Published: (2024)
Semantic-Aware Gaussian Process Calibration with Structured Layerwise Kernels for Deep Neural Networks
by: Lee, Kyung-hwan, et al.
Published: (2025)