Bibliographic Details
Main Authors: Chen, Shangfeng, Shi, Xiayang, Li, Pu, Li, Yinlin, Liu, Jingjing
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2411.08348
_version_ 1866917835903598592
author Chen, Shangfeng
Shi, Xiayang
Li, Pu
Li, Yinlin
Liu, Jingjing
contents Large language models (LLMs) have demonstrated remarkable proficiency in machine translation (MT), even without specific training on the languages in question. However, translating rare words in low-resource or domain-specific contexts remains challenging for LLMs. To address this issue, we propose a multi-step prompt chain that enhances translation faithfulness by prioritizing key terms crucial for semantic accuracy. Our method first identifies these keywords and retrieves their translations from a bilingual dictionary, integrating them into the LLM's context using Retrieval-Augmented Generation (RAG). We further mitigate potential output hallucinations caused by long prompts through an iterative self-checking mechanism, where the LLM refines its translations based on lexical and semantic constraints. Experiments using Llama and Qwen as base models on the FLORES-200 and WMT datasets demonstrate significant improvements over baselines, highlighting the effectiveness of our approach in enhancing translation faithfulness and robustness, particularly in low-resource scenarios.
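The prompt chain summarized in the abstract above (keyword selection, dictionary retrieval, constrained prompting, iterative self-checking) can be illustrated with a minimal Python sketch. This is not the authors' implementation. All function names are hypothetical, `llm` is an assumed callable standing in for any model, keyword identification is replaced by a plain dictionary lookup, and the self-check covers only a lexical substring test, with the semantic check left out.

```python
from typing import Callable, Dict, List, Optional, Sequence


def find_keywords(source: str, dictionary: Dict[str, str]) -> List[str]:
    """Assumed stand-in for keyword identification: keep source words
    that have an entry in the bilingual dictionary."""
    found: List[str] = []
    for token in source.lower().split():
        word = token.strip(".,;:!?\"'()")
        if word in dictionary and word not in found:
            found.append(word)
    return found


def build_prompt(source: str, constraints: Dict[str, str],
                 draft: Optional[str] = None,
                 missing: Sequence[str] = ()) -> str:
    """Put the retrieved dictionary entries into the model's context."""
    lines = ["Translate the sentence, using these dictionary translations:"]
    lines += [f"- {src} -> {tgt}" for src, tgt in constraints.items()]
    lines.append(f"Sentence: {source}")
    if draft is not None:
        # Refinement round: show the previous draft and what it lacks.
        lines.append(f"Previous draft: {draft}")
        lines.append("Missing terms: " + ", ".join(missing))
    return "\n".join(lines)


def missing_terms(translation: str, constraints: Dict[str, str]) -> List[str]:
    """Lexical check only: which required target terms are absent?"""
    lowered = translation.lower()
    return [tgt for tgt in constraints.values() if tgt.lower() not in lowered]


def translate_with_constraints(source: str, dictionary: Dict[str, str],
                               llm: Callable[[str], str],
                               max_rounds: int = 3) -> str:
    """Draft a translation, then re-prompt until the lexical
    constraints hold or max_rounds refinements have been tried."""
    constraints = {w: dictionary[w] for w in find_keywords(source, dictionary)}
    draft = llm(build_prompt(source, constraints))
    for _ in range(max_rounds):
        missing = missing_terms(draft, constraints)
        if not missing:
            break
        draft = llm(build_prompt(source, constraints, draft, missing))
    return draft
```

Passing the model as a callable keeps the sketch independent of any particular LLM client; a real setup would wrap a Llama or Qwen endpoint in a function that takes a prompt string and returns the generated text.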
format Preprint
id arxiv_https___arxiv_org_abs_2411_08348
institution arXiv
publishDate 2024
record_format arxiv
title Refining Translations with LLMs: A Constraint-Aware Iterative Prompting Approach
topic Computation and Language
url https://arxiv.org/abs/2411.08348