:: Library Catalog

Bibliographic details
Main authors:	Tsuchiya, Taira; Ito, Shinji; Honda, Junya
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online access:	https://arxiv.org/abs/2402.08321

Similar items

Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds
by: Ito, Shinji, et al.
Published: (2024)

Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds
by: Tsuchiya, Taira, et al.
Published: (2023)

A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $Θ(T^{2/3})$ and its Application to Best-of-Both-Worlds
by: Tsuchiya, Taira, et al.
Published: (2024)

Adversarial Learning in Games with Bandit Feedback: Logarithmic Pure-Strategy Maximin Regret
by: Ito, Shinji, et al.
Published: (2026)

Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets
by: Tsuchiya, Taira, et al.
Published: (2024)

Reinforcement Learning from Adversarial Preferences in Tabular MDPs
by: Tsuchiya, Taira, et al.
Published: (2025)

Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound
by: Sakaue, Shinsaku, et al.
Published: (2025)

Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds
by: Lee, Jongyeong, et al.
Published: (2024)

Learning with Posterior Sampling for Revenue Management under Time-varying Demand
by: Shimizu, Kazuma, et al.
Published: (2024)

Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback
by: Ito, Shinji, et al.
Published: (2025)

Corrupted Learning Dynamics in Games
by: Tsuchiya, Taira, et al.
Published: (2024)

Scale-Invariant Fast Convergence in Games
by: Tsuchiya, Taira, et al.
Published: (2026)

Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback
by: Ito, Shinji, et al.
Published: (2025)

Optimal Regret of Bernoulli Bandits under Global Differential Privacy
by: Azize, Achraf, et al.
Published: (2025)

Tight Regret Upper and Lower Bounds for Optimistic Hedge in Two-Player Zero-Sum Games
by: Tsuchiya, Taira
Published: (2025)

Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
by: Lee, Jongyeong, et al.
Published: (2025)

Revisiting Online Learning Approach to Inverse Linear Optimization: A Fenchel-Young Loss Perspective and Gap-Dependent Regret Analysis
by: Sakaue, Shinsaku, et al.
Published: (2025)

Data- and Variance-dependent Regret Bounds for Online Tabular MDPs
by: Li, Mingyi, et al.
Published: (2026)

Online Control of Linear Systems under Unbounded Noise
by: Ito, Kaito, et al.
Published: (2024)

Logarithmic Regret for Online KL-Regularized Reinforcement Learning
by: Zhao, Heyang, et al.
Published: (2025)

Thompson Exploration with Best Challenger Rule in Best Arm Identification
by: Lee, Jongyeong, et al.
Published: (2023)

Online Structured Prediction with Fenchel-Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss
by: Sakaue, Shinsaku, et al.
Published: (2024)

Logarithmic Regret of Exploration in Average Reward Markov Decision Processes
by: Boone, Victor, et al.
Published: (2025)

Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
by: Chen, Baiyuan, et al.
Published: (2025)

Heavy-tailed Linear Bandits: Adversarial Robustness, Best-of-both-worlds, and Beyond
by: Zhao, Canzhe, et al.
Published: (2025)

Logarithmic Regret for Nonlinear Control
by: Wang, James, et al.
Published: (2025)

Note on Follow-the-Perturbed-Leader in Combinatorial Semi-Bandit Problems
by: Chen, Botao, et al.
Published: (2025)

Simple Projection-Free Algorithm for Contextual Recommendation with Logarithmic Regret and Robustness
by: Sakaue, Shinsaku
Published: (2026)

Achieving Logarithmic Regret in KL-Regularized Zero-Sum Markov Games
by: Nayak, Anupam, et al.
Published: (2025)

Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
by: Zhong, Han, et al.
Published: (2023)

Multi-Player Approaches for Dueling Bandits
by: Raveh, Or, et al.
Published: (2024)

Instance-Dependent Regret Bounds for Nonstochastic Linear Partial Monitoring
by: Di Gennaro, Federico, et al.
Published: (2025)

Rate-optimal Design for Anytime Best Arm Identification
by: Komiyama, Junpei, et al.
Published: (2025)

The Survival Bandit Problem
by: Riou, Charles, et al.
Published: (2022)

A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms
by: Baudry, Dorian, et al.
Published: (2023)

Finite-Time Logarithmic Bayes Regret Upper Bounds
by: Atsidakou, Alexia, et al.
Published: (2023)

Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
by: Neopane, Ojash, et al.
Published: (2024)

Globalized Adversarial Regret Optimization: Robust Decisions with Uncalibrated Predictions
by: Kurtz, Jannis, et al.
Published: (2026)

Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit
by: Zhou, Julien, et al.
Published: (2024)

Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal
by: Ziomek, Juliusz, et al.
Published: (2024)