Staff View: Word Order and World Knowledge :: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Qinghua; Ravishankar, Vinit; Garneau, Nicolas; Søgaard, Anders
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.00876

_version_	1866914699419844608
author	Zhao, Qinghua; Ravishankar, Vinit; Garneau, Nicolas; Søgaard, Anders
author_facet	Zhao, Qinghua; Ravishankar, Vinit; Garneau, Nicolas; Søgaard, Anders
contents	Word order is an important concept in natural language, and in this work, we study how word order affects the induction of world knowledge from raw text using language models. We use word analogies to probe for such knowledge. Specifically, in addition to the natural word order, we extract texts in each of six fixed word orders from five languages and then pretrain language models on these texts. Finally, we analyze the results of the fixed word orders on word analogies and show that i) certain fixed word orders consistently outperform or underperform others, though the specifics vary across languages, and ii) the Wov2Lex hypothesis does not hold in pre-trained language models, and the natural word order typically yields mediocre results. The source code will be made publicly available at https://github.com/lshowway/probing_by_analogy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_00876
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Word Order and World Knowledge Zhao, Qinghua Ravishankar, Vinit Garneau, Nicolas Søgaard, Anders Computation and Language Artificial Intelligence Word order is an important concept in natural language, and in this work, we study how word order affects the induction of world knowledge from raw text using language models. We use word analogies to probe for such knowledge. Specifically, in addition to the natural word order, we extract texts in each of six fixed word orders from five languages and then pretrain language models on these texts. Finally, we analyze the results of the fixed word orders on word analogies and show that i) certain fixed word orders consistently outperform or underperform others, though the specifics vary across languages, and ii) the Wov2Lex hypothesis does not hold in pre-trained language models, and the natural word order typically yields mediocre results. The source code will be made publicly available at https://github.com/lshowway/probing_by_analogy.
title	Word Order and World Knowledge
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2403.00876

Similar Items