| Main authors: | Tsuchiya, Taira; Ito, Shinji; Honda, Junya |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2402.08321 |
Similar records
Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds
by: Ito, Shinji, et al.
Published: (2024)
Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds
by: Tsuchiya, Taira, et al.
Published: (2023)
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $Θ(T^{2/3})$ and its Application to Best-of-Both-Worlds
by: Tsuchiya, Taira, et al.
Published: (2024)
Adversarial Learning in Games with Bandit Feedback: Logarithmic Pure-Strategy Maximin Regret
by: Ito, Shinji, et al.
Published: (2026)
Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets
by: Tsuchiya, Taira, et al.
Published: (2024)
Reinforcement Learning from Adversarial Preferences in Tabular MDPs
by: Tsuchiya, Taira, et al.
Published: (2025)
Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound
by: Sakaue, Shinsaku, et al.
Published: (2025)
Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds
by: Lee, Jongyeong, et al.
Published: (2024)
Learning with Posterior Sampling for Revenue Management under Time-varying Demand
by: Shimizu, Kazuma, et al.
Published: (2024)
Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback
by: Ito, Shinji, et al.
Published: (2025)
Corrupted Learning Dynamics in Games
by: Tsuchiya, Taira, et al.
Published: (2024)
Scale-Invariant Fast Convergence in Games
by: Tsuchiya, Taira, et al.
Published: (2026)
Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback
by: Ito, Shinji, et al.
Published: (2025)
Optimal Regret of Bernoulli Bandits under Global Differential Privacy
by: Azize, Achraf, et al.
Published: (2025)
Tight Regret Upper and Lower Bounds for Optimistic Hedge in Two-Player Zero-Sum Games
by: Tsuchiya, Taira
Published: (2025)
Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
by: Lee, Jongyeong, et al.
Published: (2025)
Revisiting Online Learning Approach to Inverse Linear Optimization: A Fenchel-Young Loss Perspective and Gap-Dependent Regret Analysis
by: Sakaue, Shinsaku, et al.
Published: (2025)
Data- and Variance-dependent Regret Bounds for Online Tabular MDPs
by: Li, Mingyi, et al.
Published: (2026)
Online Control of Linear Systems under Unbounded Noise
by: Ito, Kaito, et al.
Published: (2024)
Logarithmic Regret for Online KL-Regularized Reinforcement Learning
by: Zhao, Heyang, et al.
Published: (2025)
Thompson Exploration with Best Challenger Rule in Best Arm Identification
by: Lee, Jongyeong, et al.
Published: (2023)
Online Structured Prediction with Fenchel-Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss
by: Sakaue, Shinsaku, et al.
Published: (2024)
Logarithmic Regret of Exploration in Average Reward Markov Decision Processes
by: Boone, Victor, et al.
Published: (2025)
Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
by: Chen, Baiyuan, et al.
Published: (2025)
Heavy-tailed Linear Bandits: Adversarial Robustness, Best-of-both-worlds, and Beyond
by: Zhao, Canzhe, et al.
Published: (2025)
Logarithmic Regret for Nonlinear Control
by: Wang, James, et al.
Published: (2025)
Note on Follow-the-Perturbed-Leader in Combinatorial Semi-Bandit Problems
by: Chen, Botao, et al.
Published: (2025)
Simple Projection-Free Algorithm for Contextual Recommendation with Logarithmic Regret and Robustness
by: Sakaue, Shinsaku
Published: (2026)
Achieving Logarithmic Regret in KL-Regularized Zero-Sum Markov Games
by: Nayak, Anupam, et al.
Published: (2025)
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
by: Zhong, Han, et al.
Published: (2023)
Multi-Player Approaches for Dueling Bandits
by: Raveh, Or, et al.
Published: (2024)
Instance-Dependent Regret Bounds for Nonstochastic Linear Partial Monitoring
by: Di Gennaro, Federico, et al.
Published: (2025)
Rate-optimal Design for Anytime Best Arm Identification
by: Komiyama, Junpei, et al.
Published: (2025)
The Survival Bandit Problem
by: Riou, Charles, et al.
Published: (2022)
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms
by: Baudry, Dorian, et al.
Published: (2023)
Finite-Time Logarithmic Bayes Regret Upper Bounds
by: Atsidakou, Alexia, et al.
Published: (2023)
Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
by: Neopane, Ojash, et al.
Published: (2024)
Globalized Adversarial Regret Optimization: Robust Decisions with Uncalibrated Predictions
by: Kurtz, Jannis, et al.
Published: (2026)
Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit
by: Zhou, Julien, et al.
Published: (2024)
Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal
by: Ziomek, Juliusz, et al.
Published: (2024)