Saved in:
Bibliographic Details
Main Authors: Gaudio, Julia, Guan, Charlie K.
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2501.14830
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912203351785472
author Gaudio, Julia
Guan, Charlie K.
author_facet Gaudio, Julia
Guan, Charlie K.
contents This paper considers the problem of label recovery in random graphs and matrices. Motivated by transitive behavior in real-world networks (i.e., ``the friend of my friend is my friend''), a recent line of work considers spatially-embedded networks, which exhibit transitive behavior. In particular, the Geometric Hidden Community Model (GHCM), introduced by Gaudio, Guan, Niu, and Wei, models a network as a labeled Poisson point process where every pair of vertices is associated with a pairwise observation whose distribution depends on the labels and positions of the vertices. The GHCM is in turn a generalization of the Geometric SBM (proposed by Baccelli and Sankararaman). Gaudio et al. provided a threshold below which exact recovery is information-theoretically impossible. Above the threshold, they provided a linear-time algorithm that succeeds in exact recovery under a certain ``distinctness-of-distributions'' assumption, which they conjectured to be unnecessary. In this paper, we partially resolve the conjecture by showing that the threshold is indeed tight for the two-community GHCM. We provide a two-phase, linear-time algorithm that explores the spatial graph in a data-driven manner in Phase I to yield an almost exact labeling, which is refined to achieve exact recovery in Phase II. Our results extend achievability to geometric formulations of well-known inference problems, such as the planted dense subgraph problem and submatrix localization, in which the distinctness-of-distributions assumption does not hold.
format Preprint
id arxiv_https___arxiv_org_abs_2501_14830
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Sharp exact recovery threshold for two-community Euclidean random graphs
Gaudio, Julia
Guan, Charlie K.
Social and Information Networks
Probability
This paper considers the problem of label recovery in random graphs and matrices. Motivated by transitive behavior in real-world networks (i.e., ``the friend of my friend is my friend''), a recent line of work considers spatially-embedded networks, which exhibit transitive behavior. In particular, the Geometric Hidden Community Model (GHCM), introduced by Gaudio, Guan, Niu, and Wei, models a network as a labeled Poisson point process where every pair of vertices is associated with a pairwise observation whose distribution depends on the labels and positions of the vertices. The GHCM is in turn a generalization of the Geometric SBM (proposed by Baccelli and Sankararaman). Gaudio et al. provided a threshold below which exact recovery is information-theoretically impossible. Above the threshold, they provided a linear-time algorithm that succeeds in exact recovery under a certain ``distinctness-of-distributions'' assumption, which they conjectured to be unnecessary. In this paper, we partially resolve the conjecture by showing that the threshold is indeed tight for the two-community GHCM. We provide a two-phase, linear-time algorithm that explores the spatial graph in a data-driven manner in Phase I to yield an almost exact labeling, which is refined to achieve exact recovery in Phase II. Our results extend achievability to geometric formulations of well-known inference problems, such as the planted dense subgraph problem and submatrix localization, in which the distinctness-of-distributions assumption does not hold.
title Sharp exact recovery threshold for two-community Euclidean random graphs
topic Social and Information Networks
Probability
url https://arxiv.org/abs/2501.14830