Saved in:
Bibliographic Details
Main Authors: Wedenborg, A. Emilie J., Mørup, Morten
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2502.04172
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866913681618501632
author Wedenborg, A. Emilie J.
Mørup, Morten
author_facet Wedenborg, A. Emilie J.
Mørup, Morten
contents Archetypal analysis (AA) is a matrix decomposition method that identifies distinct patterns using convex combinations of the data points denoted archetypes with each data point in turn reconstructed as convex combinations of the archetypes. AA thereby forms a polytope representing trade-offs of the distinct aspects in the data. Most existing methods for AA are designed for continuous data and do not exploit the structure of the data distribution. In this paper, we propose two new optimization frameworks for archetypal analysis for binary data. i) A second order approximation of the AA likelihood based on the Bernoulli distribution with efficient closed-form updates using an active set procedure for learning the convex combinations defining the archetypes, and a sequential minimal optimization strategy for learning the observation specific reconstructions. ii) A Bernoulli likelihood based version of the principal convex hull analysis (PCHA) algorithm originally developed for least squares optimization. We compare these approaches with the only existing binary AA procedure relying on multiplicative updates and demonstrate their superiority on both synthetic and real binary data. Notably, the proposed optimization frameworks for AA can easily be extended to other data distributions providing generic efficient optimization frameworks for AA based on tailored likelihood functions reflecting the underlying data distribution.
format Preprint
id arxiv_https___arxiv_org_abs_2502_04172
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Archetypal Analysis for Binary Data
Wedenborg, A. Emilie J.
Mørup, Morten
Machine Learning
Artificial Intelligence
Archetypal analysis (AA) is a matrix decomposition method that identifies distinct patterns using convex combinations of the data points denoted archetypes with each data point in turn reconstructed as convex combinations of the archetypes. AA thereby forms a polytope representing trade-offs of the distinct aspects in the data. Most existing methods for AA are designed for continuous data and do not exploit the structure of the data distribution. In this paper, we propose two new optimization frameworks for archetypal analysis for binary data. i) A second order approximation of the AA likelihood based on the Bernoulli distribution with efficient closed-form updates using an active set procedure for learning the convex combinations defining the archetypes, and a sequential minimal optimization strategy for learning the observation specific reconstructions. ii) A Bernoulli likelihood based version of the principal convex hull analysis (PCHA) algorithm originally developed for least squares optimization. We compare these approaches with the only existing binary AA procedure relying on multiplicative updates and demonstrate their superiority on both synthetic and real binary data. Notably, the proposed optimization frameworks for AA can easily be extended to other data distributions providing generic efficient optimization frameworks for AA based on tailored likelihood functions reflecting the underlying data distribution.
title Archetypal Analysis for Binary Data
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2502.04172