Bibliographic Details
Main Authors: Feldman, Michael J., Misiakiewicz, Theodor, Romanov, Elad
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2412.21038
author Feldman, Michael J.
Misiakiewicz, Theodor
Romanov, Elad
contents This work studies estimation of sparse principal components in high dimensions. Specifically, we consider a class of estimators based on kernel PCA, generalizing the covariance thresholding algorithm proposed by Krauthgamer et al. (2015). Focusing on Johnstone's spiked covariance model, we investigate the "critical" sparsity regime, where the sparsity level $m$, sample size $n$, and dimension $p$ each diverge and $m/\sqrt{n} \rightarrow \beta$, $p/n \rightarrow \gamma$. Within this framework, we develop a fine-grained understanding of signal detection and recovery. Our results establish a detectability phase transition, analogous to the Baik--Ben Arous--Péché (BBP) transition: above a certain threshold -- depending on the kernel function, $\gamma$, and $\beta$ -- kernel PCA is informative. Conversely, below the threshold, kernel principal components are asymptotically orthogonal to the signal. Notably, above this detection threshold, we find that consistent support recovery is possible with high probability. Sparsity plays a key role in our analysis, and results in more nuanced phenomena than in related studies of kernel PCA with delocalized (dense) components. Finally, we identify optimal kernel functions for detection -- and consequently, support recovery -- and numerical calculations suggest that soft thresholding is nearly optimal.
format Preprint
id arxiv_https___arxiv_org_abs_2412_21038
institution arXiv
publishDate 2024
record_format arxiv
title Sparse PCA: Phase Transitions in the Critical Regime
topic Statistics Theory
62H25, 62H12
url https://arxiv.org/abs/2412.21038
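
The abstract's central object, covariance thresholding with a soft-thresholding kernel, can be illustrated with a minimal sketch. This is not the authors' exact estimator: the scaling of the threshold by $\sqrt{n}$, the zeroed diagonal, the top-$m$ support rule, and all names and parameter values (`soft_threshold`, `covariance_thresholding`, `tau`, `lam`, the sizes `n`, `p`, `m`) are assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise soft thresholding: shrink each entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def covariance_thresholding(X, tau):
    """Top eigenpair of a soft-thresholded sample covariance matrix.

    X   : (n, p) data matrix whose rows are assumed to be centered.
    tau : threshold applied to entries after rescaling by sqrt(n)
          (an assumed normalization, not taken from the paper).
    """
    n, _ = X.shape
    S = X.T @ X / n
    # Rescale by sqrt(n), threshold entrywise, then undo the rescaling.
    K = soft_threshold(np.sqrt(n) * S, tau) / np.sqrt(n)
    # Discard the diagonal so only cross-coordinate correlations remain.
    np.fill_diagonal(K, 0.0)
    eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]


# Synthetic data from a spiked covariance model I + lam * v v^T,
# with an m-sparse unit vector v (hypothetical sizes and spike strength).
rng = np.random.default_rng(0)
n, p, m, lam = 400, 200, 10, 4.0
v = np.zeros(p)
v[:m] = 1.0 / np.sqrt(m)
X = rng.standard_normal((n, p)) + np.sqrt(lam) * rng.standard_normal((n, 1)) * v

top_val, v_hat = covariance_thresholding(X, tau=2.0)
# Simple support estimate: indices of the m largest entries in magnitude.
support_hat = np.argsort(np.abs(v_hat))[-m:]
```

Whether `v_hat` correlates with `v` depends on the spike strength, the threshold, and the ratios $m/\sqrt{n}$ and $p/n$, which is the phase-transition question the abstract describes; the sketch only shows the mechanics of the estimator.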