Bibliographic Details
Main Authors: Weissweiler, Leonie; Böbel, Nina; Guiller, Kirian; Herrera, Santiago; Scivetti, Wesley; Lorenzi, Arthur; Melnik, Nurit; Bhatia, Archna; Schütze, Hinrich; Levin, Lori; Zeldes, Amir; Nivre, Joakim; Croft, William; Schneider, Nathan
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2403.17748
author Weissweiler, Leonie
Böbel, Nina
Guiller, Kirian
Herrera, Santiago
Scivetti, Wesley
Lorenzi, Arthur
Melnik, Nurit
Bhatia, Archna
Schütze, Hinrich
Levin, Lori
Zeldes, Amir
Nivre, Joakim
Croft, William
Schneider, Nathan
contents The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labeled holistically. We argue for (i) augmenting UD annotations with a 'UCxn' annotation layer for such meaning-bearing grammatical constructions, and (ii) approaching this in a typologically informed way so that morphosyntactic strategies can be compared across languages. As a case study, we consider five construction families in ten languages, identifying instances of each construction in UD treebanks through the use of morphosyntactic patterns. In addition to findings regarding these particular constructions, our study yields important insights on methodology for describing and identifying constructions in language-general and language-particular ways, and lays the foundation for future constructional enrichment of UD treebanks.
format Preprint
id arxiv_https___arxiv_org_abs_2403_17748
institution arXiv
publishDate 2024
record_format arxiv
title UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies
topic Computation and Language
url https://arxiv.org/abs/2403.17748