Bibliographic Details
Main Author: Marcus, Ariel
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2401.10091
contents Recent models have achieved human-level performance on the Stanford Question Answering Dataset (SQuAD) when F1 scores are used to evaluate the reading comprehension task. Yet teaching machines to comprehend text has not been solved in the general case. Past research has shown that appending one adversarial sentence to the context paragraph cuts the F1 scores of reading comprehension models by almost half. In this paper, I replicate past adversarial research with a new model, ELECTRA-Small, and demonstrate that the new model's F1 score drops from 83.9% to 29.2%. To improve ELECTRA-Small's resistance to this attack, I finetune the model on SQuAD v1.1 training examples with one to five adversarial sentences appended to the context paragraph. Like past research, I find that the model finetuned on one adversarial sentence does not generalize well across evaluation datasets. However, when finetuned on four or five adversarial sentences, the model attains an F1 score of more than 70% on most evaluation datasets with multiple appended and prepended adversarial sentences. The results suggest that with enough examples we can make models robust to adversarial attacks.
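The two mechanics the abstract relies on, token-level F1 scoring and adding adversarial sentences to a context paragraph, can be sketched as follows. This is a minimal illustration, not the paper's code: the function names are hypothetical, the normalization is modeled on the usual SQuAD v1.1 convention (lowercase, strip punctuation and articles), and the example sentences are invented.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(prediction, ground_truth):
    """Harmonic mean of token precision and recall between two answer strings."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(ground_truth).split()
    if not pred_tokens or not true_tokens:
        # Two empty answers count as a match; one empty answer does not.
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


def add_adversarial_sentences(context, sentences, prepend=False):
    """Hypothetical helper: attach distractor sentences to a context paragraph."""
    distractors = " ".join(s.strip() for s in sentences)
    if prepend:
        return f"{distractors} {context}"
    return f"{context} {distractors}"


# Invented example, for illustration only.
context = "The Broncos won the final game."
attacked = add_adversarial_sentences(context, ["The Falcons won the earlier game."])
print(attacked)
print(token_f1("Broncos", "The Broncos"))
```

Appending leaves the character offset of the original answer span unchanged, while prepending shifts it by the length of the inserted text plus the joining space, so any stored answer offsets would need the same shift.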
institution arXiv
title Power in Numbers: Robust reading comprehension by finetuning with four adversarial sentences per example
topic Computation and Language