| Main Authors: | Ounissi, Oussama; Jävergård, Nicklas; Muntean, Adrian |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Methodology; Statistics Theory; Machine Learning |
| Online Access: | https://arxiv.org/abs/2510.02405 |
| _version_ | 1866916985835618304 |
|---|---|
| author | Ounissi, Oussama; Jävergård, Nicklas; Muntean, Adrian |
| author_facet | Ounissi, Oussama; Jävergård, Nicklas; Muntean, Adrian |
| contents | This work introduces the application of the Orthogonal Procrustes problem to the generation of synthetic data. The proposed methodology ensures that the resulting synthetic data preserve important statistical relationships among features, specifically the Pearson correlation. An empirical illustration using a large, real-world tabular dataset of energy consumption demonstrates the effectiveness of the approach and highlights its potential for practical synthetic data generation. The approach is not meant to replace existing generative models; rather, it serves as a lightweight post-processing step that enforces exact Pearson correlation on an already generated synthetic dataset. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2510_02405 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Orthogonal Procrustes problem preserves correlations in synthetic data |
| topic | Methodology; Statistics Theory; Machine Learning; 47A55, 15A18, 15-03 |
| url | https://arxiv.org/abs/2510.02405 |
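
The abstract describes a post-processing step that enforces an exact Pearson correlation on an already generated synthetic dataset. The following is a minimal sketch of how such a step could look, not the authors' implementation: the function name `match_correlation` and the whiten-rotate-recolor construction are assumptions made for illustration. The sketch assumes more rows than columns, a full-rank synthetic sample, and a positive definite target correlation matrix. The orthogonal factor is chosen by solving an orthogonal Procrustes problem via the SVD, so that the corrected data stay as close as possible (in Frobenius norm) to the original synthetic data.

```python
import numpy as np


def match_correlation(X_syn, R_target):
    """Return a copy of X_syn whose Pearson correlation matrix equals R_target.

    Hypothetical helper for illustration only. Assumes n > d, a full-rank
    X_syn, and a symmetric positive definite R_target with unit diagonal.
    """
    X_syn = np.asarray(X_syn, dtype=float)
    n, _ = X_syn.shape

    # Standardize each column (population standard deviation, ddof=0).
    mean, std = X_syn.mean(axis=0), X_syn.std(axis=0)
    Z = (X_syn - mean) / std

    # Whiten: W has zero column means and satisfies W.T @ W / n = I.
    evals, evecs = np.linalg.eigh(Z.T @ Z / n)
    W = Z @ evecs / np.sqrt(evals)

    # Square root of the target: L.T @ L = R_target.
    L = np.linalg.cholesky(R_target).T

    # For every orthogonal Q, Y = W @ Q @ L has correlation exactly R_target.
    # Orthogonal Procrustes step: pick the Q that minimizes ||W Q L - Z||_F,
    # which is Q = U @ Vt for the SVD of W.T @ Z @ L.T.
    U, _, Vt = np.linalg.svd(W.T @ Z @ L.T)
    Q = U @ Vt
    Y = W @ Q @ L

    # Restore the original column means and standard deviations;
    # Pearson correlation is unaffected by this per-column rescaling.
    return Y * std + mean
```

A typical call would compute the target with `np.corrcoef(X_real, rowvar=False)` on the real data and pass it together with the synthetic sample. The column means and standard deviations of the synthetic data are left unchanged; only the dependence structure is adjusted.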