| Main Authors: | Tanzer, Garrett; Shengelia, Maximus; Harrenstien, Ken; Uthus, David |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2406.11049 |
| _version_ | 1866913393309384704 |
|---|---|
| author | Tanzer, Garrett; Shengelia, Maximus; Harrenstien, Ken; Uthus, David |
| author_facet | Tanzer, Garrett; Shengelia, Maximus; Harrenstien, Ken; Uthus, David |
| contents | Historically, sign language machine translation has been posed as a sentence-level task: datasets consisting of continuous narratives are chopped up and presented to the model as isolated clips. In this work, we explore the limitations of this task framing. First, we survey a number of linguistic phenomena in sign languages that depend on discourse-level context. Then as a case study, we perform the first human baseline for sign language translation that actually substitutes a human into the machine learning task framing, rather than provide the human with the entire document as context. This human baseline -- for ASL to English translation on the How2Sign dataset -- shows that for 33% of sentences in our sample, our fluent Deaf signer annotators were only able to understand key parts of the clip in light of additional discourse-level context. These results underscore the importance of understanding and sanity checking examples when adapting machine learning to new domains. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2406_11049 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Reconsidering Sentence-Level Sign Language Translation |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2406.11049 |