MARC View :: Library Catalog

Bibliographic Details
Main Authors:	McDonald, Curtis; Barron, Andrew R
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Information Theory
Online Access:	https://arxiv.org/abs/2407.18802

_version_	1866929438524964864
author	McDonald, Curtis; Barron, Andrew R
author_facet	McDonald, Curtis; Barron, Andrew R
contents	In this work, we present a sampling algorithm for single hidden layer neural networks. This algorithm is built upon a recursive series of Bayesian posteriors using a method we call Greedy Bayes. Sampling of the Bayesian posterior for neuron weight vectors $w$ of dimension $d$ is challenging because of its multimodality. Our algorithm to tackle this problem is based on a coupling of the posterior density for $w$ with an auxiliary random variable $ξ$. The resulting reverse conditional $w|ξ$ of neuron weights given the auxiliary random variable is shown to be log-concave. In the construction of the posterior distributions we provide some freedom in the choice of the prior. In particular, for Gaussian priors on $w$ with suitably small variance, the resulting marginal density of the auxiliary variable $ξ$ is proven to be strictly log-concave for all dimensions $d$. For a uniform prior on the unit $\ell_1$ ball, evidence is given that the density of $ξ$ is again strictly log-concave for sufficiently large $d$. The score of the marginal density of the auxiliary random variable $ξ$ is determined by an expectation over $w|ξ$ and thus can be computed by various rapidly mixing Markov Chain Monte Carlo methods. Moreover, the computation of the score of $ξ$ permits methods of sampling $ξ$ by a stochastic diffusion (Langevin dynamics) with drift function built from this score. With such dynamics, information-theoretic methods pioneered by Bakry and Emery show that accurate sampling of $ξ$ is obtained rapidly when its density is indeed strictly log-concave. After that, one more draw from $w|ξ$ produces neuron weights $w$ whose marginal distribution is the desired posterior.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_18802
institution	arXiv
publishDate	2024
record_format	arxiv
title	Log-Concave Coupling for Sampling Neural Net Posteriors
topic	Machine Learning; Information Theory
url	https://arxiv.org/abs/2407.18802
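
Illustrative note on the abstract: the three-step structure it describes (estimate the score of $ξ$ as an expectation over $w|ξ$, run Langevin dynamics on $ξ$, then draw once more from $w|ξ$) can be sketched on a toy model. The sketch below is not the paper's coupling. It assumes a hypothetical Gaussian stand-in, $w \sim N(0, s^2 I)$ and $ξ|w \sim N(w, I)$, for which $w|ξ$ is Gaussian and the score of $ξ$ equals $E[w|ξ] - ξ$. All function and parameter names are made up for this example, and the inner expectation is taken by direct sampling where the abstract calls for MCMC.

```python
import numpy as np

def sample_w_given_xi(xi, s2, n, rng):
    # Toy conditional w | xi when w ~ N(0, s2 I) and xi | w ~ N(w, I):
    # Gaussian with mean s2/(1+s2) * xi and per-coordinate variance s2/(1+s2).
    shrink = s2 / (1.0 + s2)
    noise = rng.standard_normal((n,) + xi.shape)
    return shrink * xi + np.sqrt(shrink) * noise

def score_xi(xi, s2, n_inner, rng):
    # Score of the marginal density of xi, written as an expectation over w | xi.
    # For unit-variance Gaussian noise this is E[w | xi] - xi; the expectation
    # is replaced by a Monte Carlo average over n_inner draws.
    w = sample_w_given_xi(xi, s2, n_inner, rng)
    return w.mean(axis=0) - xi

def langevin_then_draw_w(n_chains=2000, d=2, s2=0.5, step=0.05,
                         n_steps=300, n_inner=16, seed=0):
    # Unadjusted Langevin dynamics on xi (independent chains in parallel),
    # followed by one final draw of w from w | xi.
    rng = np.random.default_rng(seed)
    xi = np.zeros((n_chains, d))
    for _ in range(n_steps):
        drift = score_xi(xi, s2, n_inner, rng)
        xi = xi + step * drift + np.sqrt(2.0 * step) * rng.standard_normal(xi.shape)
    w = sample_w_given_xi(xi, s2, 1, rng)[0]
    return xi, w
```

In this toy model the marginal of $ξ$ is $N(0, (1 + s^2) I)$, which is strictly log-concave, so the empirical variance of the returned `xi` should be close to `1 + s2` and that of `w` close to `s2`, up to discretization and Monte Carlo error.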
