Staff View: Knowledge-Based Counterfactual Queries for Visual Question Answering :: Library Catalog

Bibliographic Details
Main Authors:	Stoikou, Theodoti; Lymperaiou, Maria; Stamou, Giorgos
Format:	Preprint
Published:	2023
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2303.02601

_version_	1866911863734796288
author	Stoikou, Theodoti; Lymperaiou, Maria; Stamou, Giorgos
author_facet	Stoikou, Theodoti; Lymperaiou, Maria; Stamou, Giorgos
contents	Visual Question Answering (VQA) is a popular task that combines vision and language, with numerous implementations in the literature. Although some work addresses explainability and robustness in VQA models, very little of it employs counterfactuals to probe these issues in a model-agnostic way. In this work, we propose a systematic method for explaining the behavior and investigating the robustness of VQA models through counterfactual perturbations. To this end, we exploit structured knowledge bases to perform deterministic, optimal and controllable word-level replacements targeting the linguistic modality, and we then evaluate the model's responses to these counterfactual inputs. Finally, we qualitatively extract local and global explanations from the counterfactual responses, which prove insightful for interpreting VQA model behavior. By applying a variety of perturbation types that target different parts of speech in the input question, we gain insight into the model's reasoning by comparing its responses under different adversarial conditions. Overall, we reveal possible biases in the model's decision-making process, as well as expected and unexpected patterns that affect its performance both quantitatively and qualitatively.
format	Preprint
id	arxiv_https___arxiv_org_abs_2303_02601
institution	arXiv
publishDate	2023
record_format	arxiv
title	Knowledge-Based Counterfactual Queries for Visual Question Answering
topic	Computation and Language
url	https://arxiv.org/abs/2303.02601

Similar Items