Bibliographic Details
Main authors: Tam, Edric; Engelhardt, Barbara E
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online access: https://arxiv.org/abs/2501.00744
_version_ 1866912973996425216
author Tam, Edric
Engelhardt, Barbara E
author_facet Tam, Edric
Engelhardt, Barbara E
contents Chest X-ray (CXR) images are among the most commonly used diagnostic imaging modalities in clinical practice. Stringent privacy constraints often limit the public dissemination of patient CXR images, contributing to the increasing use of synthetic images produced by deep generative models for data sharing and training machine learning models. Given the high-stakes downstream applications of CXR images, it is crucial to evaluate how faithfully synthetic images reflect the underlying target distribution. We propose the embedded characteristic score (ECS), a flexible evaluation procedure that compares synthetic and patient CXR samples through characteristic function transforms of feature embeddings. The choice of embedding can be tailored to the clinical or scientific context of interest. By leveraging the behavior of characteristic functions near the origin, ECS is sensitive to differences in higher moments and distribution tails, aspects that are often overlooked by commonly used evaluation metrics such as the Fréchet Inception Distance (FID). We establish theoretical properties of ECS and describe a calibration strategy based on a simple resampling procedure. We compare the empirical performance of ECS against FID via simulations and standard benchmark imaging datasets. Assessing synthetic CXR images with ECS uncovers clinically relevant distributional discrepancies relative to patient CXR images. These results highlight the importance of reliable evaluation of synthetic data that inform high-stakes decisions.
format Preprint
id arxiv_https___arxiv_org_abs_2501_00744
institution arXiv
publishDate 2025
record_format arxiv
title Assessing the Distributional Fidelity of Synthetic Chest X-rays using the Embedded Characteristic Score
topic Machine Learning
url https://arxiv.org/abs/2501.00744