:: Library Catalog

Bibliographic Details
Main Authors:	Xia, Yu; Wang, Chi-Hua; Mabry, Joshua; Cheng, Guang
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2406.13130

Similar Items

Privacy Auditing Synthetic Data Release through Local Likelihood Attacks
by: Ward, Joshua, et al.
Published: (2025)

Finding Connections: Membership Inference Attacks for the Multi-Table Synthetic Data Setting
by: Ward, Joshua, et al.
Published: (2026)

Simulation-Based Benchmarking of Reinforcement Learning Agents for Personalized Retail Promotions
by: Xia, Yu, et al.
Published: (2024)

Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models
by: Ward, Joshua, et al.
Published: (2024)

Data Deletion for Linear Regression with Noisy SGD
by: Xia, Zhangjie, et al.
Published: (2024)

Improve Fidelity and Utility of Synthetic Credit Card Transaction Time Series from Data-centric Perspective
by: Hsieh, Din-Yin, et al.
Published: (2024)

Synth-MIA: A Testbed for Auditing Privacy Leakage in Tabular Data Synthesis
by: Ward, Joshua, et al.
Published: (2025)

When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation
by: Ward, Joshua, et al.
Published: (2025)

Downstream Task-Oriented Generative Model Selections on Synthetic Data Training for Fraud Detection Models
by: Cheng, Yinan, et al.
Published: (2024)

Risk In Context: Benchmarking Privacy Leakage of Foundation Models in Synthetic Tabular Data Generation
by: Byun, Jessup, et al.
Published: (2025)

Towards High Supervised Learning Utility Training Data Generation: Data Pruning and Column Reordering
by: Kwok, Tung Sum Thomas, et al.
Published: (2025)

Utility Theory of Synthetic Data Generation
by: Xu, Shirong, et al.
Published: (2023)

Discriminative Estimation of Total Variation Distance: A Fidelity Auditor for Generative Data
by: Tao, Lan, et al.
Published: (2024)

Ensembling Membership Inference Attacks Against Tabular Generative Models
by: Ward, Joshua, et al.
Published: (2025)

BadGD: A unified data-centric framework to identify gradient descent vulnerabilities
by: Wang, Chi-Hua, et al.
Published: (2024)

Structured Evaluation of Synthetic Tabular Data
by: Yang, Scott Cheng-Hsin, et al.
Published: (2024)

A Comprehensive Survey of Synthetic Tabular Data Generation
by: Shi, Ruxue, et al.
Published: (2025)

A Comprehensive Evaluation Framework for Synthetic Trip Data Generation in Public Transport
by: Wu, Yuanyuan, et al.
Published: (2025)

DEREC-SIMPRO: unlock Language Model benefits to advance Synthesis in Data Clean Room
by: Kwok, Tung Sum Thomas, et al.
Published: (2024)

GReaTER: Generate Realistic Tabular data after data Enhancement and Reduction
by: Kwok, Tung Sum Thomas, et al.
Published: (2025)

MatWheel: Addressing Data Scarcity in Materials Science Through Synthetic Data
by: Li, Wentao, et al.
Published: (2025)

Data Science In Olfaction
by: Agarwal, Vivek, et al.
Published: (2024)

Deep Learning in Single-Cell and Spatial Transcriptomics Data Analysis: Advances and Challenges from a Data Science Perspective
by: Ge, Shuang, et al.
Published: (2024)

Generating Synthetic Net Load Data with Physics-informed Diffusion Model
by: Zhang, Shaorong, et al.
Published: (2024)

Watermarking Generative Categorical Data
by: Gu, Bochao, et al.
Published: (2024)

The LLM Data Auditor: A Metric-oriented Survey on Quality and Trustworthiness in Evaluating Synthetic Data
by: Zhang, Kaituo, et al.
Published: (2026)

Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation
by: Segal, Bradley, et al.
Published: (2025)

Overcoming Pitfalls in Graph Contrastive Learning Evaluation: Toward Comprehensive Benchmarks
by: Ma, Qian, et al.
Published: (2024)

Comprehensive Exploration of Synthetic Data Generation: A Survey
by: Bauer, André, et al.
Published: (2024)

Defining 'Good': Evaluation Framework for Synthetic Smart Meter Data
by: Chai, Sheng, et al.
Published: (2024)

Memisis: Orchestrating and Evaluating Synthetic Data for Tabular Health Datasets
by: Nagesh, Nitish, et al.
Published: (2026)

FEST: A Unified Framework for Evaluating Synthetic Tabular Data
by: Niu, Weijie, et al.
Published: (2025)

Online Forgetting Process for Linear Regression Models
by: Li, Yuantong, et al.
Published: (2020)

Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective
by: Yuan, Hao, et al.
Published: (2023)

Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
by: Gill, Waris, et al.
Published: (2025)

Watermarking Generative Tabular Data
by: He, Hengzhi, et al.
Published: (2024)

FairRR: Pre-Processing for Group Fairness through Randomized Response
by: Zeng, Xianli, et al.
Published: (2024)

Reasoning-Driven Synthetic Data Generation and Evaluation
by: Davidson, Tim R., et al.
Published: (2026)

Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference
by: Amad, Harry, et al.
Published: (2025)

Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation
by: Long, Yunbo, et al.
Published: (2025)