| Main Authors: | Sokoloski, Sacha; Berens, Philipp |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2206.04841 |
| _version_ | 1866908469644230656 |
|---|---|
| author | Sokoloski, Sacha; Berens, Philipp |
| author_facet | Sokoloski, Sacha; Berens, Philipp |
| contents | We introduce hierarchical mixtures of Gaussians (HMoGs), which unify dimensionality reduction and clustering into a single probabilistic model. HMoGs provide closed-form expressions for the model likelihood, exact inference over latent states and cluster membership, and exact algorithms for maximum-likelihood optimization. The novel exponential family parameterization of HMoGs greatly reduces their computational complexity relative to similar model-based methods, allowing them to efficiently model hundreds of latent dimensions, and thereby capture additional structure in high-dimensional data. We demonstrate HMoGs on synthetic experiments and MNIST, and show how joint optimization of dimensionality reduction and clustering facilitates increased model performance. We also explore how sparsity-constrained dimensionality reduction can further improve clustering performance while encouraging interpretability. By bridging classical statistical modelling with the scale of modern data and compute, HMoGs offer a practical approach to high-dimensional clustering that preserves statistical rigour, interpretability, and uncertainty quantification that is often missing from embedding-based, variational, and self-supervised methods. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2206_04841 |
| institution | arXiv |
| publishDate | 2022 |
| record_format | arxiv |
| title | Hierarchical mixtures of Gaussians for combined dimensionality reduction and clustering |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2206.04841 |