Staff View: Towards Generating Automatic Anaphora Annotations :: Library Catalog

Bibliographic Details
Main Authors:	Taji, Dima; Zeman, Daniel
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2503.09417

_version_	1866908276591951872
author	Taji, Dima; Zeman, Daniel
author_facet	Taji, Dima; Zeman, Daniel
contents	Training models that perform well on various NLP tasks requires large amounts of data, and this becomes more apparent with nuanced tasks such as anaphora and coreference resolution. To combat the prohibitive costs of creating manually annotated gold data, this paper explores two methods to automatically create datasets with coreferential annotations: direct conversion from existing datasets, and parsing using multilingual models capable of handling new and unseen languages. The paper details the current progress on those two fronts, as well as the challenges the efforts currently face, and our approach to overcoming these challenges.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_09417
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Towards Generating Automatic Anaphora Annotations Taji, Dima Zeman, Daniel Computation and Language Training models that perform well on various NLP tasks requires large amounts of data, and this becomes more apparent with nuanced tasks such as anaphora and coreference resolution. To combat the prohibitive costs of creating manually annotated gold data, this paper explores two methods to automatically create datasets with coreferential annotations: direct conversion from existing datasets, and parsing using multilingual models capable of handling new and unseen languages. The paper details the current progress on those two fronts, as well as the challenges the efforts currently face, and our approach to overcoming these challenges.
title	Towards Generating Automatic Anaphora Annotations
topic	Computation and Language
url	https://arxiv.org/abs/2503.09417

Similar Items