Bibliographic Details
Main Authors: Chen, Shangfeng; Shi, Xiayang; Li, Pu; Li, Yinlin; Liu, Jingjing
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2411.08348
Abstract:
  • Large language models (LLMs) have demonstrated remarkable proficiency in machine translation (MT), even without specific training on the languages in question. However, translating rare words in low-resource or domain-specific contexts remains challenging for LLMs. To address this issue, we propose a multi-step prompt chain that enhances translation faithfulness by prioritizing key terms crucial for semantic accuracy. Our method first identifies these keywords and retrieves their translations from a bilingual dictionary, integrating them into the LLM's context using Retrieval-Augmented Generation (RAG). We further mitigate potential output hallucinations caused by long prompts through an iterative self-checking mechanism, where the LLM refines its translations based on lexical and semantic constraints. Experiments using Llama and Qwen as base models on the FLORES-200 and WMT datasets demonstrate significant improvements over baselines, highlighting the effectiveness of our approach in enhancing translation faithfulness and robustness, particularly in low-resource scenarios.
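The prompt chain described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prompt wording, and the `llm` callable are all hypothetical. Keyword identification is replaced here by a plain dictionary lookup, and only the lexical constraint is checked (the semantic check mentioned in the abstract is omitted). The dictionary is assumed to map lower-case single words to their target-language translations.

```python
import re
from typing import Callable, Dict, List


def find_keywords(source: str, dictionary: Dict[str, str]) -> List[str]:
    """Stand-in for keyword identification: keep source words that
    have an entry in the bilingual dictionary (assumed lower-case keys)."""
    keywords: List[str] = []
    for word in re.findall(r"\w+", source.lower()):
        if word in dictionary and word not in keywords:
            keywords.append(word)
    return keywords


def build_prompt(source: str, glossary: Dict[str, str], feedback: str = "") -> str:
    """Put the retrieved dictionary entries into the prompt (the RAG step)."""
    lines = ["Translate the sentence into the target language.",
             f"Sentence: {source}"]
    if glossary:
        lines.append("Use these dictionary translations:")
        lines += [f"- {src} -> {tgt}" for src, tgt in glossary.items()]
    if feedback:
        lines.append(feedback)
    return "\n".join(lines)


def missing_terms(translation: str, glossary: Dict[str, str]) -> List[str]:
    """Lexical constraint: every glossary translation should appear in the output."""
    lowered = translation.lower()
    return [tgt for tgt in glossary.values() if tgt.lower() not in lowered]


def translate(source: str, dictionary: Dict[str, str],
              llm: Callable[[str], str], max_rounds: int = 3) -> str:
    """Translate, then re-prompt until the lexical check passes or rounds run out."""
    glossary = {k: dictionary[k] for k in find_keywords(source, dictionary)}
    translation = llm(build_prompt(source, glossary))
    for _ in range(max_rounds):
        missing = missing_terms(translation, glossary)
        if not missing:
            break
        feedback = (f"Previous translation: {translation}\n"
                    f"It is missing: {', '.join(missing)}. Revise it.")
        translation = llm(build_prompt(source, glossary, feedback))
    return translation
```

Any function that takes a prompt string and returns the model's reply can be passed as `llm`, so the sketch is independent of the base model. A semantic check could be added alongside `missing_terms` in the same loop.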