Staff View: Learning conditional distributions on continuous spaces

Bibliographic Details
Main Authors:	Bénézet, Cyril; Cheng, Ziteng; Jaimungal, Sebastian
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Statistics Theory
Online Access:	https://arxiv.org/abs/2406.09375

_version_	1866910485663711232
author	Bénézet, Cyril; Cheng, Ziteng; Jaimungal, Sebastian
author_facet	Bénézet, Cyril; Cheng, Ziteng; Jaimungal, Sebastian
contents	We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes, allowing for different dimensions of the feature and target spaces. Our approach involves clustering data near varying query points in the feature space to create empirical measures in the target space. We employ two distinct clustering schemes: one based on a fixed-radius ball and the other on nearest neighbors. We establish upper bounds for the convergence rates of both methods and, from these bounds, deduce optimal configurations for the radius and the number of neighbors. We propose to incorporate the nearest neighbors method into neural network training, since our empirical analysis indicates that it performs better in practice. For efficiency, our training process uses approximate nearest neighbors search with random binary space partitioning. Additionally, we employ the Sinkhorn algorithm and a sparsity-enforced transport plan. Our empirical findings demonstrate that, with a suitably designed structure, the neural network can adapt locally to a suitable level of Lipschitz continuity. For reproducibility, our code is available at https://github.com/zcheng-a/LCD_kNN.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_09375
institution	arXiv
publishDate	2024
record_format	arxiv
title	Learning conditional distributions on continuous spaces
topic	Machine Learning; Statistics Theory
url	https://arxiv.org/abs/2406.09375

Similar Items