Staff View: From Superficial Patterns to Semantic Understanding: Fine-Tuning Language Models on Contrast Sets :: Library Catalog

Bibliographic Details
Main Author:	Petrov, Daniel
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.02683

_version_	1866909450916331520
author	Petrov, Daniel
author_facet	Petrov, Daniel
contents	Large-scale pre-trained language models have demonstrated high performance on standard datasets for natural language inference (NLI) tasks. Unfortunately, these evaluations can be misleading, as although the models can perform well on in-distribution data, they perform poorly on out-of-distribution test sets, such as contrast sets. Contrast sets consist of perturbed instances of data that have very minor, but meaningful, changes to the input that alter the gold label, revealing how models can learn superficial patterns in the training data rather than learning more sophisticated language nuances. As an example, the ELECTRA-small language model achieves nearly 90% accuracy on an SNLI dataset but drops to 75% when tested on an out-of-distribution contrast set. The research carried out in this study explores how the robustness of a language model can be improved by exposing it to small amounts of more complex contrast sets during training to help it better learn language patterns. With this approach, the model recovers performance and achieves nearly 90% accuracy on contrast sets, highlighting the importance of diverse and challenging training data.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_02683
institution	arXiv
publishDate	2025
record_format	arxiv
title	From Superficial Patterns to Semantic Understanding: Fine-Tuning Language Models on Contrast Sets
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2501.02683

Similar Items