Bibliographic Details
Main Authors: Vamvas, Jannis; Prat, Ignacio Pérez; Soliva, Not Battesta; Baltermia-Guetg, Sandra; Beeli, Andrina; Beeli, Simona; Capeder, Madlaina; Decurtins, Laura; Gregori, Gian Peder; Hobi, Flavia; Holderegger, Gabriela; Lazzarini, Arina; Lazzarini, Viviana; Rosselli, Walter; Vital, Bettina; Rutkiewicz, Anna; Sennrich, Rico
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2509.03148
author Vamvas, Jannis
Prat, Ignacio Pérez
Soliva, Not Battesta
Baltermia-Guetg, Sandra
Beeli, Andrina
Beeli, Simona
Capeder, Madlaina
Decurtins, Laura
Gregori, Gian Peder
Hobi, Flavia
Holderegger, Gabriela
Lazzarini, Arina
Lazzarini, Viviana
Rosselli, Walter
Vital, Bettina
Rutkiewicz, Anna
Sennrich, Rico
contents The Romansh language, spoken in Switzerland, has limited resources for machine translation evaluation. In this paper, we present a benchmark for six varieties of Romansh: Rumantsch Grischun, a supra-regional variety, and five regional varieties: Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader. Our reference translations were created by human translators based on the WMT24++ benchmark, which ensures parallelism with more than 55 other languages. An automatic evaluation of existing MT systems and LLMs shows that translation out of Romansh into German is handled relatively well for all the varieties, but translation into Romansh is still challenging.
format Preprint
id arxiv_https___arxiv_org_abs_2509_03148
institution arXiv
publishDate 2025
record_format arxiv
title Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader
topic Computation and Language
url https://arxiv.org/abs/2509.03148