Bibliographic details
Main authors: Clerico, Eugenio; Farghly, Tyler; Deligiannidis, George; Guedj, Benjamin; Doucet, Arnaud
Format: Preprint
Published: 2022
Online access: https://arxiv.org/abs/2209.02525
author Clerico, Eugenio
Farghly, Tyler
Deligiannidis, George
Guedj, Benjamin
Doucet, Arnaud
contents We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.
format Preprint
id arxiv_https___arxiv_org_abs_2209_02525
institution arXiv
publishDate 2022
record_format arxiv
title Generalisation under gradient descent via deterministic PAC-Bayes
topic Machine Learning
url https://arxiv.org/abs/2209.02525
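
The abstract states that the bounds depend on the density of the initial distribution and on the Hessian of the training objective along the trajectory. The sketch below is not the paper's bound. It only illustrates, under stated assumptions, why those two quantities arise together: a deterministic gradient descent step x -> x - step * grad(x) has Jacobian I - step * hess(x), so by the change-of-variables formula the log-density of the iterate can be tracked from the initial log-density and the Hessians visited. The toy quadratic objective, the Gaussian initial distribution, and all function names (`gaussian_logpdf`, `track_log_density`) are assumptions made for this illustration, and the step size is assumed small enough for the update map to be injective.

```python
import numpy as np


def gaussian_logpdf(x, cov):
    """Log-density of a zero-mean Gaussian with covariance `cov` at point x."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def track_log_density(x0, log_rho0, grad, hess, step, n_steps):
    """Run gradient descent from x0 and track the log-density of the iterate.

    Each update T(x) = x - step * grad(x) has Jacobian I - step * hess(x).
    Assuming T is injective, the change-of-variables formula gives
    log rho_{k+1}(T(x)) = log rho_k(x) - log|det(I - step * hess(x))|.
    """
    x = np.array(x0, dtype=float)
    log_rho = float(log_rho0(x))
    identity = np.eye(x.shape[0])
    for _ in range(n_steps):
        jac = identity - step * hess(x)
        _, logabsdet = np.linalg.slogdet(jac)
        log_rho -= logabsdet
        x = x - step * grad(x)
    return x, log_rho


# Hypothetical toy objective L(x) = 0.5 * x^T A x, so grad = A x and hess = A.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
sigma2 = 1.5            # variance of the assumed Gaussian initial distribution
step, n_steps = 0.1, 20
x0 = np.array([0.7, -1.2])

x_final, log_rho_final = track_log_density(
    x0,
    lambda x: gaussian_logpdf(x, sigma2 * np.eye(2)),
    lambda x: A @ x,
    lambda x: A,
    step,
    n_steps,
)

# For this linear-Gaussian toy case the final iterate is Gaussian with
# covariance sigma2 * M M^T, where M = (I - step * A)^n_steps, which gives
# a closed form to compare the tracked value against.
M = np.linalg.matrix_power(np.eye(2) - step * A, n_steps)
closed_form = gaussian_logpdf(x_final, sigma2 * M @ M.T)
print(log_rho_final, closed_form)
```

In the quadratic case the Hessian is constant, so the correction is the same at every step. For a general objective `hess` would be evaluated at each iterate, which is what makes the tracked quantity depend on the whole trajectory.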