Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Udgivet: |
2022
|
| Fag: | |
| Online adgang: | https://arxiv.org/abs/2202.02464 |
| Tags: |
Tilføj Tag
Ingen Tags, Vær først til at tagge denne postø!
|
Indholdsfortegnelse:
- This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-$k$ nearest neighbor (NN) searches. We consider a distributed learning scenario, in which a massive dataset is split into smaller groups, where the $k$-NNs are found for a query point with respect to each subset of data. We propose \emph{optimal} rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed $k$-NN rules with $M$ groups has a performance comparable to the standard $Θ(kM)$-NN rules even for fixed $k$.