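Staff View: Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective :: Library Catalog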

Bibliographic Details
Main Authors:	Weng, Zhaotian; Gao, Zijun; Andrews, Jerone; Zhao, Jieyu
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language; Computer Vision and Pattern Recognition; I.2.7
Online Access:	https://arxiv.org/abs/2407.02814

_version_	1866918047099387904
author	Weng, Zhaotian; Gao, Zijun; Andrews, Jerone; Zhao, Jieyu
author_facet	Weng, Zhaotian; Gao, Zijun; Andrews, Jerone; Zhao, Jieyu
contents	Vision-language models (VLMs) pre-trained on extensive datasets can inadvertently learn biases by correlating gender information with specific objects or scenarios. Current methods, which focus on modifying inputs and monitoring changes in the model's output probability scores, often struggle to comprehensively understand bias from the perspective of model components. We propose a framework that incorporates causal mediation analysis to measure and map the pathways of bias generation and propagation within VLMs. This approach allows us to identify the direct effects of interventions on model bias and the indirect effects of interventions on bias mediated through different model components. Our results show that image features are the primary contributors to bias, with significantly higher impacts than text features, specifically accounting for 32.57% and 12.63% of the bias in the MSCOCO and PASCAL-SENTENCE datasets, respectively. Notably, the image encoder's contribution surpasses that of the text encoder and the deep fusion encoder. Further experimentation confirms that contributions from both language and vision modalities are aligned and non-conflicting. Consequently, focusing on blurring gender representations within the image encoder, which contributes most to the model bias, reduces bias efficiently by 22.03% and 9.04% in the MSCOCO and PASCAL-SENTENCE datasets, respectively, with minimal performance loss or increased computational demands.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_02814
institution	arXiv
publishDate	2024
record_format	arxiv
title	Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective
topic	Artificial Intelligence; Computation and Language; Computer Vision and Pattern Recognition; I.2.7
url	https://arxiv.org/abs/2407.02814