| Main Authors: | Bosch-Romeu, Raquel; Falcó, Antonio; Rodríguez-Gallego, José-Antonio |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.01975 |
| _version_ | 1866912937342402560 |
|---|---|
| author | Bosch-Romeu, Raquel; Falcó, Antonio; Rodríguez-Gallego, José-Antonio |
| author_facet | Bosch-Romeu, Raquel; Falcó, Antonio; Rodríguez-Gallego, José-Antonio |
| contents | We introduce a supervised dimensionality reduction methodology for categorical (and discretized mixed-type) data based on a density-matrix construction induced by class-conditional frequencies. Given a labeled dataset encoded in a one-hot survey space, we assemble a frequency matrix whose columns aggregate feature occurrences within each class, and define a normalized Gram-type operator that satisfies the axioms of a density matrix. The resulting representation admits an intrinsic rank bound controlled by the number of classes, enabling low-dimensional spectral embeddings via dominant eigenmodes. Classification is performed in the reduced space through class-conditional kernel density estimation and a maximum-likelihood decision rule. We establish structural invariances, provide complexity estimates, and validate the approach on synthetic benchmarks probing high cardinality, sparsity, noise, and class imbalance. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_01975 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Density-Matrix Spectral Embeddings for Categorical Data: Operator Structure and Stability |
| topic | Machine Learning; Numerical Analysis; 15A18, 5A83, 65F15, 62H30, 62G07 |
| url | https://arxiv.org/abs/2603.01975 |
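
The pipeline summarized in the contents field (one-hot encoding, class-wise frequency matrix, normalized Gram-type operator, spectral embedding, class-conditional kernel density classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the exact normalization, so the sketch assumes the operator is F Fᵀ divided by its trace, where F holds per-class feature counts. The function names, the Gaussian kernel, the bandwidth `h`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def density_matrix(X, y, n_classes):
    """X: (n, d) one-hot rows, y: (n,) integer labels.
    Assumed construction: rho = F F^T / trace(F F^T)."""
    d = X.shape[1]
    F = np.zeros((d, n_classes))
    for c in range(n_classes):
        F[:, c] = X[y == c].sum(axis=0)  # feature counts within class c
    G = F @ F.T                          # Gram-type operator, PSD by construction
    return G / np.trace(G), F            # unit trace, so rho behaves like a density matrix

def spectral_embedding(rho, X, r):
    """Project one-hot rows onto the r dominant eigenvectors of rho."""
    evals, evecs = np.linalg.eigh(rho)   # eigenvalues in ascending order
    top = evecs[:, ::-1][:, :r]          # r leading eigenmodes
    return X @ top, evals[::-1][:r]

def kde_log_likelihood(Z_class, Z_new, h):
    """Log of a Gaussian kernel density estimate at each row of Z_new."""
    r = Z_class.shape[1]
    sq = ((Z_new[:, None, :] - Z_class[None, :, :]) ** 2).sum(axis=-1)
    a = -sq / (2.0 * h ** 2)
    m = a.max(axis=1, keepdims=True)     # stabilize the log-mean-exp
    log_mean = m[:, 0] + np.log(np.exp(a - m).mean(axis=1))
    return log_mean - 0.5 * r * np.log(2.0 * np.pi * h ** 2)

def classify(Z_train, y_train, Z_new, n_classes, h=0.5):
    """Maximum-likelihood rule over class-conditional KDEs."""
    ll = np.column_stack([
        kde_log_likelihood(Z_train[y_train == c], Z_new, h)
        for c in range(n_classes)
    ])
    return ll.argmax(axis=1)

# Purely synthetic toy data: 3 categorical variables with 4 levels each, 2 classes.
rng = np.random.default_rng(0)
n, n_vars, n_levels, K = 200, 3, 4, 2
y = rng.integers(0, K, size=n)
probs = np.array([[0.55, 0.25, 0.15, 0.05],
                  [0.05, 0.15, 0.25, 0.55]])
cats = np.array([[rng.choice(n_levels, p=probs[c]) for _ in range(n_vars)]
                 for c in y])
X = np.zeros((n, n_vars * n_levels))
X[np.arange(n)[:, None], np.arange(n_vars) * n_levels + cats] = 1.0

rho, F = density_matrix(X, y, K)
Z, top_evals = spectral_embedding(rho, X, r=K)
pred = classify(Z, y, Z, K)
```

Under this assumed construction, `rho` is symmetric, positive semidefinite, and has unit trace, and its rank cannot exceed the number of columns of `F`, which is the number of classes. That is why the sketch keeps `r = K` eigenmodes.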