Staff View: Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning :: Library Catalog

Bibliographic Details
Main Authors:	Han, Xu; Fan, Fangfang; Rong, Jingzhao; Li, Zhen; Fakhri, Georges El; Chen, Qingyu; Liu, Xiaofeng
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.14847

_version_	1866929664429129728
author	Han, Xu; Fan, Fangfang; Rong, Jingzhao; Li, Zhen; Fakhri, Georges El; Chen, Qingyu; Liu, Xiaofeng
author_facet	Han, Xu; Fan, Fangfang; Rong, Jingzhao; Li, Zhen; Fakhri, Georges El; Chen, Qingyu; Liu, Xiaofeng
contents	Text-to-medical-image (T2MedI) generation with latent diffusion models has great potential to alleviate the scarcity of medical imaging data and to explore the underlying appearance distribution of lesions for a specific patient status description. However, as with text-to-natural-image models, we show that a T2MedI model can also be biased toward some subgroups and overlook minority ones in the training set. In this work, we first build a T2MedI model based on the pre-trained Imagen model, which has a fixed contrastive language-image pre-training (CLIP) text encoder, while its decoder has been fine-tuned on medical images from the Radiology Objects in COntext (ROCO) dataset. Its gender bias is analyzed qualitatively and quantitatively. To address this issue, we propose to fine-tune the T2MedI model toward the target application dataset to align their sensitive-subgroup distribution probabilities. Specifically, the alignment loss for fine-tuning is guided by an off-the-shelf sensitivity-subgroup classifier to match the classification probability between the generated images and the expected target dataset. In addition, image quality is maintained by a CLIP-consistency regularization term following a knowledge distillation scheme. For evaluation, we set the target dataset to be enhanced as the BraTS18 dataset and trained a brain magnetic resonance (MR) slice-based gender classifier on it. With our method, the generated MR images markedly reduce the inconsistency with the gender proportion in the BraTS18 dataset.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_14847
institution	arXiv
publishDate	2024
record_format	arxiv
title	Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.14847
