Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.14830 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866912203351785472 |
|---|---|
| author | Gaudio, Julia Guan, Charlie K. |
| author_facet | Gaudio, Julia Guan, Charlie K. |
| contents | This paper considers the problem of label recovery in random graphs and matrices. Motivated by transitive behavior in real-world networks (i.e., ``the friend of my friend is my friend''), a recent line of work considers spatially-embedded networks, which exhibit transitive behavior. In particular, the Geometric Hidden Community Model (GHCM), introduced by Gaudio, Guan, Niu, and Wei, models a network as a labeled Poisson point process where every pair of vertices is associated with a pairwise observation whose distribution depends on the labels and positions of the vertices. The GHCM is in turn a generalization of the Geometric SBM (proposed by Baccelli and Sankararaman). Gaudio et al. provided a threshold below which exact recovery is information-theoretically impossible. Above the threshold, they provided a linear-time algorithm that succeeds in exact recovery under a certain ``distinctness-of-distributions'' assumption, which they conjectured to be unnecessary. In this paper, we partially resolve the conjecture by showing that the threshold is indeed tight for the two-community GHCM. We provide a two-phase, linear-time algorithm that explores the spatial graph in a data-driven manner in Phase I to yield an almost exact labeling, which is refined to achieve exact recovery in Phase II. Our results extend achievability to geometric formulations of well-known inference problems, such as the planted dense subgraph problem and submatrix localization, in which the distinctness-of-distributions assumption does not hold. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2501_14830 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Sharp exact recovery threshold for two-community Euclidean random graphs Gaudio, Julia Guan, Charlie K. Social and Information Networks Probability This paper considers the problem of label recovery in random graphs and matrices. Motivated by transitive behavior in real-world networks (i.e., ``the friend of my friend is my friend''), a recent line of work considers spatially-embedded networks, which exhibit transitive behavior. In particular, the Geometric Hidden Community Model (GHCM), introduced by Gaudio, Guan, Niu, and Wei, models a network as a labeled Poisson point process where every pair of vertices is associated with a pairwise observation whose distribution depends on the labels and positions of the vertices. The GHCM is in turn a generalization of the Geometric SBM (proposed by Baccelli and Sankararaman). Gaudio et al. provided a threshold below which exact recovery is information-theoretically impossible. Above the threshold, they provided a linear-time algorithm that succeeds in exact recovery under a certain ``distinctness-of-distributions'' assumption, which they conjectured to be unnecessary. In this paper, we partially resolve the conjecture by showing that the threshold is indeed tight for the two-community GHCM. We provide a two-phase, linear-time algorithm that explores the spatial graph in a data-driven manner in Phase I to yield an almost exact labeling, which is refined to achieve exact recovery in Phase II. Our results extend achievability to geometric formulations of well-known inference problems, such as the planted dense subgraph problem and submatrix localization, in which the distinctness-of-distributions assumption does not hold. |
| title | Sharp exact recovery threshold for two-community Euclidean random graphs |
| topic | Social and Information Networks Probability |
| url | https://arxiv.org/abs/2501.14830 |