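Staff View: Challenges And Innovations In Synthetic Data Generation: Toward Context-Aware, Privacy-Preserving, And High-Utility AI Data :: Library Catalog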

Bibliographic Details
Main Author:	Madhukiran Vaddi
Format:	Digital resource
Language:
Published:	Zenodo 2026
Online Access:	https://doi.org/10.5281/zenodo.18743782

_version_	1866901704726806528
author	Madhukiran Vaddi
author_facet	Madhukiran Vaddi
contents	<p>The rapid growth of artificial intelligence applications demands large datasets that balance fidelity, privacy, and utility. Synthetic data generation has become a leading response to privacy regulations, data scarcity, and regulatory hurdles in the medical, financial, and autonomous-systems domains. Classical generative models have inherent problems with distributional accuracy, mode collapse, privacy assurance, and computational efficiency. The Context-Aware Distribution-Adaptive Synthetic Generator framework addresses these shortcomings by jointly optimizing distributional consistency, privacy, and downstream utility. It combines Wasserstein distance-based distribution matching, adaptive noise injection, covariance preservation, and hybrid GAN-VAE optimization. Context-aware caching schemes enable distributional modeling over fine-grained demographic, temporal, and operational segments while guaranteeing differential privacy. Experimental evaluation on standard tabular datasets shows significant gains in distributional fidelity, downstream task performance, privacy preservation, and computational efficiency over standard generative methods. The framework provides building blocks for scalable, production-grade synthetic data pipelines that can be deployed in regulated, privacy-sensitive systems, where practical viability depends on optimizing many competing goals simultaneously.</p>
format	Digital resource
id	zenodo_https___doi_org_10_5281_zenodo_18743782
institution	Zenodo
language
publishDate	2026
publisher	Zenodo
record_format	zenodo
title	Challenges And Innovations In Synthetic Data Generation: Toward Context-Aware, Privacy-Preserving, And High-Utility AI Data
url	https://doi.org/10.5281/zenodo.18743782

Similar Items