Bibliographic Details
Main Authors: Li, Rumeng, Wang, Xun, Yu, Hong
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2508.18607
author Li, Rumeng
Wang, Xun
Yu, Hong
contents Translating electronic health record (EHR) narratives from English to Spanish is a clinically important yet challenging task because parallel-aligned corpora are scarce and the narratives contain many unknown words. To address these challenges, we propose NOOV (for No OOV, that is, no out-of-vocabulary words), a new neural machine translation (NMT) system that requires little in-domain parallel-aligned data for training. NOOV integrates a bilingual lexicon automatically learned from parallel-aligned corpora and a phrase look-up table extracted from a large biomedical knowledge resource to alleviate both the unknown-word problem and the word-repeat problem in NMT and to improve phrase generation. Evaluation shows that NOOV generates better translations of EHR narratives, with improvements in both accuracy and fluency.
format Preprint
id arxiv_https___arxiv_org_abs_2508_18607
institution arXiv
publishDate 2025
record_format arxiv
title A New NMT Model for Translating Clinical Texts from English to Spanish
topic Computation and Language
url https://arxiv.org/abs/2508.18607