Indholdsfortegnelse: :: Library Catalog

Saved in:

Bibliografiske detaljer
Main Authors:	Ryu, J. Jon, Kim, Young-Han
Format:	Preprint
Udgivet:	2022
Fag:	Statistics Theory Distributed, Parallel, and Cluster Computing Information Theory Machine Learning
Online adgang:	https://arxiv.org/abs/2202.02464
Tags:	Tilføj Tag Ingen Tags, Vær først til at tagge denne postø!

Indholdsfortegnelse:

This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-$k$ nearest neighbor (NN) searches. We consider a distributed learning scenario, in which a massive dataset is split into smaller groups, where the $k$-NNs are found for a query point with respect to each subset of data. We propose \emph{optimal} rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed $k$-NN rules with $M$ groups has a performance comparable to the standard $Θ(kM)$-NN rules even for fixed $k$.

Lignende værker