Staff View: Computational Approaches for Exponential-Family Factor Analysis :: Library Catalog

Bibliographic Details
Main Authors:	Wang, Liang; Carvalho, Luis
Format:	Preprint
Published:	2024
Subjects:	Methodology; Computation
Online Access:	https://arxiv.org/abs/2403.14925

_version_	1866908682489430016
author	Wang, Liang; Carvalho, Luis
author_facet	Wang, Liang; Carvalho, Luis
contents	We study a general factor analysis framework where the $n$-by-$p$ data matrix is assumed to follow a general exponential family distribution entry-wise. While this model framework has been proposed before, we here further relax its distributional assumption by using a quasi-likelihood setup. By parameterizing the mean-variance relationship on data entries, we additionally introduce a dispersion parameter and entry-wise weights to model large variations and missing values. The resulting model is thus not only robust to distribution misspecification but also more flexible and able to capture mean-dependent covariance structures of the data matrix. Our main focus is on efficient computational approaches to perform the factor analysis. Previous modeling frameworks rely on simulated maximum likelihood (SML) to find the factorization solution, but this method was shown to lead to asymptotic bias when the simulated sample size grows slower than the square root of the sample size $n$, eliminating its practical application for data matrices with large $n$. Borrowing from expectation-maximization (EM) and stochastic gradient descent (SGD), we investigate three estimation procedures based on iterative factorization updates. Our proposed solution does not show asymptotic biases, and scales even better for large matrix factorizations with error $O(1/p)$. To support our findings, we conduct simulation experiments and discuss its application in four case studies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_14925
institution	arXiv
publishDate	2024
record_format	arxiv
title	Computational Approaches for Exponential-Family Factor Analysis
topic	Methodology; Computation
url	https://arxiv.org/abs/2403.14925

Similar Items