Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Author:	Hunter, Tim
Format:	Preprint
Published:	2024
Subjects:	Computation and Language Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.12271
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866929546097328128
author	Hunter, Tim
author_facet	Hunter, Tim
contents	A central goal of linguistic theory is to find a precise characterization of the notion "possible human language", in the form of a computational device that is capable of describing all and only the languages that can be acquired by a typically developing human child. The success of recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal. This would only be the case if, in addition to succeeding in learning human languages, LLMs struggle to learn "impossible" human languages. Kallini et al. (2024; "Mission: Impossible Language Models", Proc. ACL) conducted experiments aiming to test this by training GPT-2 on a variety of synthetic languages, and found that it learns some more successfully than others. They present these asymmetries as support for the idea that LLMs' inductive biases align with what is regarded as "possible" for human languages, but the most significant comparison has a confound that makes this conclusion unwarranted. In this paper I explain the confound and suggest some ways forward towards constructing a comparison that appropriately tests the underlying issue.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_12271
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Kallini et al. (2024) do not compare impossible languages with constituency-based ones Hunter, Tim Computation and Language Artificial Intelligence A central goal of linguistic theory is to find a precise characterization of the notion "possible human language", in the form of a computational device that is capable of describing all and only the languages that can be acquired by a typically developing human child. The success of recent large language models (LLMs) in NLP applications arguably raises the possibility that LLMs might be computational devices that meet this goal. This would only be the case if, in addition to succeeding in learning human languages, LLMs struggle to learn "impossible" human languages. Kallini et al. (2024; "Mission: Impossible Language Models", Proc. ACL) conducted experiments aiming to test this by training GPT-2 on a variety of synthetic languages, and found that it learns some more successfully than others. They present these asymmetries as support for the idea that LLMs' inductive biases align with what is regarded as "possible" for human languages, but the most significant comparison has a confound that makes this conclusion unwarranted. In this paper I explain the confound and suggest some ways forward towards constructing a comparison that appropriately tests the underlying issue.
title	Kallini et al. (2024) do not compare impossible languages with constituency-based ones
topic	Computation and Language Artificial Intelligence
url	https://arxiv.org/abs/2410.12271

Similar Items