Bibliographic Details
Main Authors: Bosch-Romeu, Raquel, Falcó, Antonio, Rodríguez-Gallego, José-Antonio
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2603.01975
_version_ 1866912937342402560
author Bosch-Romeu, Raquel
Falcó, Antonio
Rodríguez-Gallego, José-Antonio
contents We introduce a supervised dimensionality reduction methodology for categorical (and discretized mixed-type) data based on a density-matrix construction induced by class-conditional frequencies. Given a labeled dataset encoded in a one-hot survey space, we assemble a frequency matrix whose columns aggregate feature occurrences within each class, and define a normalized Gram-type operator that satisfies the axioms of a density matrix. The resulting representation admits an intrinsic rank bound controlled by the number of classes, enabling low-dimensional spectral embeddings via dominant eigenmodes. Classification is performed in the reduced space through class-conditional kernel density estimation and a maximum-likelihood decision rule. We establish structural invariances, provide complexity estimates, and validate the approach on synthetic benchmarks probing high cardinality, sparsity, noise, and class imbalance.
format Preprint
id arxiv_https___arxiv_org_abs_2603_01975
institution arXiv
publishDate 2026
record_format arxiv
title Density-Matrix Spectral Embeddings for Categorical Data: Operator Structure and Stability
topic Machine Learning
Numerical Analysis
15A18, 15A83, 65F15, 62H30, 62G07
url https://arxiv.org/abs/2603.01975