Bibliographic Details
Main Author: Hunter, Tim
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2410.12271
author Hunter, Tim
author_facet Hunter, Tim
contents A central goal of linguistic theory is to find a precise characterization of the notion "possible human language", in the form of a computational device that is capable of describing all and only the languages that can be acquired by a typically developing human child. The success of recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal. This would only be the case if, in addition to succeeding in learning human languages, LLMs struggle to learn "impossible" human languages. Kallini et al. (2024; "Mission: Impossible Language Models", Proc. ACL) conducted experiments aiming to test this by training GPT-2 on a variety of synthetic languages, and found that it learns some more successfully than others. They present these asymmetries as support for the idea that LLMs' inductive biases align with what is regarded as "possible" for human languages, but the most significant comparison has a confound that makes this conclusion unwarranted. In this paper I explain the confound and suggest some ways forward towards constructing a comparison that appropriately tests the underlying issue.
format Preprint
id arxiv_https___arxiv_org_abs_2410_12271
institution arXiv
publishDate 2024
record_format arxiv
title Kallini et al. (2024) do not compare impossible languages with constituency-based ones
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2410.12271