Saved in:
Bibliographic Details
Main Authors: Carpentier, Alexandra, Verzelen, Nicolas
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2602.23023
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866910034083971072
author Carpentier, Alexandra
Verzelen, Nicolas
author_facet Carpentier, Alexandra
Verzelen, Nicolas
contents We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $Δ$ between mean vectors to partially recover the underlying partition. While the minimax-optimal threshold for $Δ$ is well-established, a significant gap exists between this information-theoretic limit and the performance of known polynomial-time procedures. Although this gap was recently characterized in the high-dimensional regime ($n \leq dK$), it remains largely unexplored in the moderate-dimensional regime ($n \geq dK$). In this manuscript, we address this regime by establishing a new low-degree polynomial lower bound for the moderate-dimensional case when $d \geq K$. We show that while the difficulty of clustering for $n \leq dK$ is primarily driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-parametric rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.
format Preprint
id arxiv_https___arxiv_org_abs_2602_23023
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Low-degree Lower bounds for clustering in moderate dimension
Carpentier, Alexandra
Verzelen, Nicolas
Statistics Theory
Machine Learning
Probability
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. Specifically, we investigate the requisite minimal distance $Δ$ between mean vectors to partially recover the underlying partition. While the minimax-optimal threshold for $Δ$ is well-established, a significant gap exists between this information-theoretic limit and the performance of known polynomial-time procedures. Although this gap was recently characterized in the high-dimensional regime ($n \leq dK$), it remains largely unexplored in the moderate-dimensional regime ($n \geq dK$). In this manuscript, we address this regime by establishing a new low-degree polynomial lower bound for the moderate-dimensional case when $d \geq K$. We show that while the difficulty of clustering for $n \leq dK$ is primarily driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-parametric rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.
title Low-degree Lower bounds for clustering in moderate dimension
topic Statistics Theory
Machine Learning
Probability
url https://arxiv.org/abs/2602.23023