

Bibliographic Details
Main Author:	Shulakov, Volodymyr
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Software Engineering; G.3
Online Access:	https://arxiv.org/abs/2407.13016

_version_	1866911959340810240
author	Shulakov, Volodymyr
author_facet	Shulakov, Volodymyr
contents	Synthetic tabular data is becoming a necessity as concerns about data privacy intensify worldwide. Tabular data can be useful for testing various systems, simulating real data, analyzing the data itself, or building predictive models. Unfortunately, such data may not be available due to confidentiality issues. Previous techniques, such as TVAE (Xu et al., 2019) or OCTGAN (Kim et al., 2021), are either unable to handle particularly complex datasets or are complex themselves, resulting in inferior run-time performance. This paper introduces PSVAE, a new, simple model that is capable of producing high-quality synthetic data in less run time. PSVAE incorporates two key ideas: loss optimization and post-selection. Along with these ideas, the proposed model compensates for underrepresented categories and uses a modern activation function, Mish (Misra, 2019).
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_13016
institution	arXiv
publishDate	2024
record_format	arxiv
title	High-Quality Tabular Data Generation using Post-Selected VAE
topic	Machine Learning; Software Engineering; G.3
url	https://arxiv.org/abs/2407.13016
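
The abstract above mentions the Mish activation function (Misra, 2019), which is commonly defined as mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x). The following is a minimal standalone sketch of that formula only. It is not taken from the PSVAE paper and says nothing about how PSVAE applies it; the function names are chosen here for illustration.

```python
import math


def softplus(x: float) -> float:
    """Softplus, ln(1 + e^x), written in a numerically stable form."""
    # max(x, 0) + log1p(exp(-|x|)) equals ln(1 + e^x) but avoids
    # overflow in exp() for large positive x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    # Print a few sample values to show the shape of the function.
    for value in (-2.0, 0.0, 1.0, 5.0):
        print(f"mish({value}) = {mish(value):.6f}")
```

Mish is smooth and, unlike ReLU, lets small negative inputs pass through with a small negative output, while behaving almost like the identity for large positive inputs.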
