:: Library Catalog

Bibliographic Details
Main Authors:	Scheid, Antoine, Boursier, Etienne, Durmus, Alain, Jordan, Michael I., Ménard, Pierre, Moulines, Eric, Valko, Michal
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2410.17055

Similar Items

Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
by: Scheid, Antoine, et al.
Published: (2024)

Online Decision-Making in Tree-Like Multi-Agent Games with Transfers
by: Scheid, Antoine, et al.
Published: (2025)

Incentivized Learning in Principal-Agent Bandit Games
by: Scheid, Antoine, et al.
Published: (2024)

Online Decision-Focused Learning
by: Capitaine, Aymeric, et al.
Published: (2025)

Test-then-Punish: A Statistical Approach to Repeated Games
by: Capitaine, Aymeric, et al.
Published: (2026)

Unravelling in Collaborative Learning
by: Capitaine, Aymeric, et al.
Published: (2024)

Prediction-Aware Learning in Multi-Agent Systems
by: Capitaine, Aymeric, et al.
Published: (2025)

Demonstration-Regularized RL
by: Tiapkin, Daniil, et al.
Published: (2023)

Model-free Posterior Sampling via Learning Rate Randomization
by: Tiapkin, Daniil, et al.
Published: (2023)

Proximal Point Nash Learning from Human Feedback
by: Tiapkin, Daniil, et al.
Published: (2025)

Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier
by: Fiegel, Côme, et al.
Published: (2026)

A single algorithm for both restless and rested rotting bandits
by: Seznec, Julien, et al.
Published: (2026)

Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians
by: Huix, Tom, et al.
Published: (2024)

Bridging Diffusion Posterior Sampling and Monte Carlo methods: a survey
by: Janati, Yazid, et al.
Published: (2025)

On Sampling with Approximate Transport Maps
by: Grenioux, Louis, et al.
Published: (2023)

Scaffold with Stochastic Gradients: New Analysis with Linear Speed-Up
by: Mangold, Paul, et al.
Published: (2025)

Explaining and Preventing Alignment Collapse in Iterative RLHF
by: Gauthier, Etienne, et al.
Published: (2026)

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
by: Fiegel, Côme, et al.
Published: (2026)

Piecewise deterministic generative models
by: Bertazzi, Andrea, et al.
Published: (2024)

Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors
by: Janati, Yazid, et al.
Published: (2024)

Categorical Reparameterization with Denoising Diffusion models
by: Gourevitch, Samson, et al.
Published: (2026)

Refined Analysis of Federated Averaging and Federated Richardson-Romberg
by: Mangold, Paul, et al.
Published: (2024)

Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance
by: Moufad, Badr, et al.
Published: (2025)

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
by: Zhu, Banghua, et al.
Published: (2024)

Planning in entropy-regularized Markov decision processes and games
by: Grill, Jean-Bastien, et al.
Published: (2026)

Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation
by: Gourevitch, Samson, et al.
Published: (2026)

Early alignment in two-layer networks training is a two-edged sword
by: Boursier, Etienne, et al.
Published: (2024)

Simplicity bias and optimization threshold in two-layer ReLU networks
by: Boursier, Etienne, et al.
Published: (2024)

Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective
by: Boursier, Etienne, et al.
Published: (2025)

Penalising the biases in norm regularisation enforces sparsity
by: Boursier, Etienne, et al.
Published: (2023)

A Mixture-Based Framework for Guiding Diffusion Models
by: Janati, Yazid, et al.
Published: (2025)

Bandits on graphs and structures
by: Valko, Michal
Published: (2026)

Adaptive graph-based algorithms for conditional anomaly detection and semi-supervised learning
by: Valko, Michal
Published: (2026)

A survey on multi-player bandits
by: Boursier, Etienne, et al.
Published: (2022)

Variational Diffusion Posterior Sampling with Midpoint Guidance
by: Moufad, Badr, et al.
Published: (2024)

Rosenthal-type inequalities for linear statistics of Markov chains
by: Durmus, Alain, et al.
Published: (2023)

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
by: Sheshukova, Marina, et al.
Published: (2024)

Covariance-adapting algorithm for semi-bandits with application to sparse rewards
by: Perrault, Pierre, et al.
Published: (2026)

RLHF Workflow: From Reward Modeling to Online RLHF
by: Dong, Hanze, et al.
Published: (2024)

Joint Channel Selection using FedDRL in V2X
by: Mancini, Lorenzo, et al.
Published: (2024)