Bibliographic Details
Main Authors: Chandra, Noirrit Kiran, Dunson, David B., Xu, Jason
Format: Preprint
Published: 2023
Subjects:
Online Access: https://arxiv.org/abs/2305.04113
author Chandra, Noirrit Kiran
Dunson, David B.
Xu, Jason
contents Factor analysis provides a canonical framework for imposing lower-dimensional structure such as sparse covariance in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, for instance, in reproducing studies across research groups. In such cases, it is natural to seek to learn the shared versus condition-specific structure. Existing hierarchical extensions of factor analysis have been proposed, but face practical issues including identifiability problems. To address these shortcomings, we propose a class of SUbspace Factor Analysis (SUFA) models, which characterize variation across groups at the level of a lower-dimensional subspace. We prove that the proposed class of SUFA models leads to identifiability of the shared versus group-specific components of the covariance, and study their posterior contraction properties. Taking a Bayesian approach, these contributions are developed alongside efficient posterior computation algorithms. Our sampler fully integrates out latent variables, is easily parallelizable, and has complexity that does not depend on sample size. We illustrate the methods through application to integration of multiple gene expression datasets relevant to immunology.
format Preprint
id arxiv_https___arxiv_org_abs_2305_04113
institution arXiv
publishDate 2023
record_format arxiv
title Inferring Covariance Structure from Multiple Data Sources via Subspace Factor Analysis
topic Methodology
Statistics Theory
Computation
url https://arxiv.org/abs/2305.04113