Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.01663 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Table of Contents:
- We study community recovery in the planted partition model in regimes where the number and sizes of communities may vary arbitrarily with the number of vertices. In such highly unbalanced settings, standard accuracy or overlap-based metrics become inadequate for assessing recovery performance. Instead, we propose the correlation coefficient between partitions as a recovery metric, which remains meaningful even when the number or sizes of communities differ substantially. We then analyze a simple common-neighbor-based clustering rule which groups two adjacent vertices if they share more than one common neighbor. We establish explicit recovery conditions under sparse inter-community connectivity, without requiring prior knowledge of the model parameters. In particular, in graphs of size $n$, this algorithm achieves exact recovery for communities with sizes $Ω(\log n)$, almost exact recovery for sizes $ω(1)$ and weak recovery for sizes $Ω(1)$. In contrast to most existing results, which assume (nearly) balanced communities, our method successfully recovers small and heterogeneously-sized communities, and improves existing guarantees even in some balanced settings. Finally, our results apply to community sizes that follow a power-law distribution, a characteristic frequently found in real-world networks.