Bibliographic Details
Main Author: Chopra, Sahil
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2401.05727
_version_ 1866929206702637056
author Chopra, Sahil
author_facet Chopra, Sahil
contents Part-of-speech (POS) tagging in a zero-resource setting can be an effective approach for low-resource languages when no labeled training data is available. Existing systems use two main techniques for POS tagging: pretrained multilingual large language models (LLMs), or projecting the source-language labels onto the zero-resource target language and training a sequence labeling model on the projected data. We explore the latter approach, using an off-the-shelf alignment module and training a hidden Markov model (HMM) to predict the POS tags. We evaluate a transfer learning setup with English as the source language and French, German, and Spanish as the target languages. Our conclusion is that projected alignment data in a zero-resource language can be beneficial for predicting POS tags.
format Preprint
id arxiv_https___arxiv_org_abs_2401_05727
institution arXiv
publishDate 2024
record_format arxiv
title Zero Resource Cross-Lingual Part Of Speech Tagging
topic Computation and Language
url https://arxiv.org/abs/2401.05727
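The approach summarized in the contents field (project source-language POS labels through word alignments, then train an HMM on the projected tags) can be illustrated with a minimal sketch. This is not the author's implementation. The function names (`project_tags`, `train_hmm`, `viterbi`), the toy sentences, the `"X"` fallback tag for unaligned tokens, and the add-one smoothing are all assumptions made for illustration, and the alignment pairs are assumed to come from some external word aligner that is not shown here.

```python
import math
from collections import Counter, defaultdict

def project_tags(src_tags, alignment, tgt_len, default="X"):
    """Copy each source tag onto the target token it is aligned with.

    `alignment` is a list of (source_index, target_index) pairs, assumed
    to be produced by an external aligner. Unaligned target tokens keep
    the hypothetical fallback tag `default`.
    """
    tgt_tags = [default] * tgt_len
    for s, t in alignment:
        tgt_tags[t] = src_tags[s]
    return tgt_tags

def train_hmm(tagged_sents):
    """Collect transition and emission counts from (word, tag) sequences."""
    trans, emit = defaultdict(Counter), defaultdict(Counter)
    for sent in tagged_sents:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
    return trans, emit

def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (smoothing is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def viterbi(words, trans, emit):
    """Return the most likely tag sequence for `words` under the HMM."""
    tags = sorted(emit)
    n_tags = len(tags)
    vocab_size = len({w for c in emit.values() for w in c}) + 1  # +1 for unseen words
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{t: (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0].lower(), vocab_size), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i].lower(), vocab_size)
            column[t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy usage: English tags projected onto French tokens via a made-up alignment.
french = ["le", "chat", "noir"]
projected = project_tags(["DET", "ADJ", "NOUN"], [(0, 0), (1, 2), (2, 1)], len(french))
corpus = [
    list(zip(french, projected)),
    [("le", "DET"), ("chat", "NOUN"), ("dort", "VERB")],
    [("un", "DET"), ("chien", "NOUN"), ("mange", "VERB")],
]
trans, emit = train_hmm(corpus)
print(viterbi(["le", "chien", "dort"], trans, emit))
```

A real setup would replace the hand-written alignment with the output of an alignment tool run over a parallel corpus, and would likely filter noisy or unaligned projections before training.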