| Main Authors: | Wastl, Michelle, Vamvas, Jannis, Calleri, Selena, Sennrich, Rico |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.21677 |
Similar Items
SwissGov-RSD: A Human-annotated, Cross-lingual Benchmark for Token-level Recognition of Semantic Differences Between Related Documents
by: Wastl, Michelle, et al.
Published: (2025)
Machine Translation Models are Zero-Shot Detectors of Translation Direction
by: Wastl, Michelle, et al.
Published: (2024)
SwissBERT: The Multilingual Language Model for Switzerland
by: Vamvas, Jannis, et al.
Published: (2023)
Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
by: Vamvas, Jannis, et al.
Published: (2024)
Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
by: Vamvas, Jannis, et al.
Published: (2024)
The Mediomatix Corpus: Parallel Data for Romansh Language Varieties via Comparable Schoolbooks
by: Hopton, Zachary, et al.
Published: (2025)
Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents
by: Hu, Hanxu, et al.
Published: (2025)
Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding
by: Sennrich, Rico, et al.
Published: (2023)
Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models
by: Mohammadshahi, Alireza, et al.
Published: (2023)
Leveraging In-Context Learning for Political Bias Testing of LLMs
by: Haller, Patrick, et al.
Published: (2025)
QueST: Incentivizing LLMs to Generate Difficult Problems
by: Hu, Hanxu, et al.
Published: (2025)
Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents
by: Grosjean, Juri, et al.
Published: (2024)
DeReason: A Difficulty-Aware Curriculum Improves Decoupled SFT-then-RL Training for General Reasoning
by: Hu, Hanxu, et al.
Published: (2026)
Translation Asymmetry in LLMs as a Data Augmentation Factor: A Case Study for 6 Romansh Language Varieties
by: Vamvas, Jannis, et al.
Published: (2026)
Robust Language Identification for Romansh Varieties
by: Model, Charlotte, et al.
Published: (2026)
RUMLEM: A Dictionary-Based Lemmatizer for Romansh
by: Fischer, Dominic P., et al.
Published: (2026)
Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks
by: Semenov, Kirill, et al.
Published: (2025)
Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples
by: Michail, Andrianos, et al.
Published: (2025)
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
by: Kew, Tannon, et al.
Published: (2023)
Evaluating Automatic Metrics with Incremental Machine Translation Systems
by: Wu, Guojun, et al.
Published: (2024)
SwissGPC v1.0 -- The Swiss German Podcasts Corpus
by: Stucki, Samuel, et al.
Published: (2025)
A Corpus for Sentence-level Subjectivity Detection on English News Articles
by: Antici, Francesco, et al.
Published: (2023)
An Analysis of BPE Vocabulary Trimming in Neural Machine Translation
by: Cognetta, Marco, et al.
Published: (2024)
Expanding the WMT24++ Benchmark with Rumantsch Grischun, Sursilvan, Sutsilvan, Surmiran, Puter, and Vallader
by: Vamvas, Jannis, et al.
Published: (2025)
Robust Native Language Identification through Agentic Decomposition
by: Uluslu, Ahmet Yavuz, et al.
Published: (2025)
Information Representation Fairness in Long-Document Embeddings: The Peculiar Interaction of Positional and Language Bias
by: Schuhmacher, Elias, et al.
Published: (2026)
RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts
by: Li, Hongzheng, et al.
Published: (2024)
Conversational Lexicography: Querying Lexicographic Data on Knowledge Graphs with SPARQL through Natural Language
by: Sennrich, Kilian, et al.
Published: (2025)
CommonMorph: Participatory Morphological Documentation Platform
by: Mahmudi, Aso, et al.
Published: (2026)
SignCLIP: Connecting Text and Sign Language by Contrastive Learning
by: Jiang, Zifan, et al.
Published: (2024)
Quantifying Generative Media Bias with a Corpus of Real-world and Generated News Articles
by: Trhlik, Filip, et al.
Published: (2024)
CASIMIR: A Corpus of Scientific Articles enhanced with Multiple Author-Integrated Revisions
by: Jourdan, Leane, et al.
Published: (2024)
Machine Translation Meta Evaluation through Translation Accuracy Challenge Sets
by: Moghe, Nikita, et al.
Published: (2024)
MultimodalHugs: Enabling Sign Language Processing in Hugging Face
by: Sant, Gerard, et al.
Published: (2025)
Swiss Parliaments Corpus Re-Imagined (SPC_R): Enhanced Transcription with RAG-based Correction and Predicted BLEU
by: Timmel, Vincenzo, et al.
Published: (2025)
Parity-Aware Byte-Pair Encoding: Improving Cross-lingual Fairness in Tokenization
by: Foroutan, Negar, et al.
Published: (2025)
A Topic-aware Comparable Corpus of Chinese Variations
by: Lian, Da-Chen, et al.
Published: (2024)
Triangulating Temporal Dynamics in Multilingual Swiss Online News
by: Victor, Bros, et al.
Published: (2026)
A Multilingual Similarity Dataset for News Article Frame
by: Chen, Xi, et al.
Published: (2024)
Scalable Detection of Salient Entities in News Articles
by: Asgarieh, Eliyar, et al.
Published: (2024)