| Main Authors: | Castilho, Sheila; Fitzsimmons, Zoe; Holton, Claire; Mc Donagh, Aoife |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.07680 |
Similar Items
Context-Aware Monolingual Human Evaluation of Machine Translation
by: Picinini, Silvio, et al.
Published: (2025)
Critical Confabulation: Can LLMs Hallucinate for Social Good?
by: Sui, Peiqi, et al.
Published: (2025)
Confabulation: The Surprising Value of Large Language Model Hallucinations
by: Sui, Peiqi, et al.
Published: (2024)
Audio-Based Crowd-Sourced Evaluation of Machine Translation Quality
by: Haq, Sami Ul, et al.
Published: (2025)
Confabulations from ACL Publications (CAP): A Dataset for Scientific Hallucination Detection
by: Gamba, Federica, et al.
Published: (2025)
Anchored Confabulation: Partial Evidence Non-Monotonically Amplifies Confident Hallucination in LLMs
by: Lathkar, Ashish Balkishan
Published: (2026)
Extending CREAMT: Leveraging Large Language Models for Literary Translation Post-Editing
by: Castaldo, Antonio, et al.
Published: (2025)
Fluency and Faithfulness in Human and Machine Literary Translation
by: Griebel, Sarah, et al.
Published: (2026)
Simpson's Paradox and the Accuracy-Fluency Tradeoff in Translation
by: Lim, Zheng Wei, et al.
Published: (2024)
Data-Efficient Domain Adaptation for LLM-based MT using Contrastive Preference Optimization
by: Vieira, Inacio, et al.
Published: (2025)
Translating Under Pressure: Domain-Aware LLMs for Crisis Communication
by: Castaldo, Antonio, et al.
Published: (2026)
How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes
by: Vieira, Inacio, et al.
Published: (2024)
Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task
by: Qiu, Mengyang, et al.
Published: (2025)
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation
by: Ouyang, Jialin
Published: (2025)
Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation
by: Baltaji, Razan, et al.
Published: (2024)
OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection
by: Huang, Chenyang, et al.
Published: (2024)
ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations
by: Wang, Yindong, et al.
Published: (2025)
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
by: Zebaze, Armel, et al.
Published: (2025)
Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training
by: Keisha, Figarri, et al.
Published: (2025)
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
by: Zhou, Tianyi, et al.
Published: (2025)
On the Hallucination in Simultaneous Machine Translation
by: Zhong, Meizhi, et al.
Published: (2024)
LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context
by: Yamashita, Natsuo, et al.
Published: (2025)
Fairness or Fluency? An Investigation into Language Bias of Pairwise LLM-as-a-Judge
by: Zhou, Xiaolin, et al.
Published: (2026)
SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish
by: McGiff, Josh, et al.
Published: (2025)
Long-context Reference-based MT Quality Estimation
by: Haq, Sami Ul, et al.
Published: (2025)
Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency
by: Smith, Matthew L., et al.
Published: (2026)
ECO Decoding: Entropy-Based Control for Controllability and Fluency in Controllable Dialogue Generation
by: Shin, Seungmin, et al.
Published: (2025)
Span-Level Hallucination Detection for LLM-Generated Answers
by: Elchafei, Passant, et al.
Published: (2025)
Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems
by: Qian, Shenbin, et al.
Published: (2025)
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models
by: Stogiannidis, Ilias, et al.
Published: (2025)
Banishing LLM Hallucinations Requires Rethinking Generalization
by: Li, Johnny, et al.
Published: (2024)
Word Alignment as Preference for Machine Translation
by: Wu, Qiyu, et al.
Published: (2024)
Feeding Two Birds or Favoring One? Adequacy-Fluency Tradeoffs in Evaluation and Meta-Evaluation of Machine Translation
by: Shayegh, Behzad, et al.
Published: (2025)
MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection
by: Borra, Federico, et al.
Published: (2024)
Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)
Machine Translation in the Covid domain: an English-Irish case study for LoResMT 2021
by: Lankford, Séamus, et al.
Published: (2024)
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
by: Sun, Yuhong, et al.
Published: (2024)
LLM Hallucination Detection: HSAD
by: Li, JinXin, et al.
Published: (2025)
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
by: Yehuda, Yakir, et al.
Published: (2024)
Synthetic Dataset Creation and Fine-Tuning of Transformer Models for Question Answering in Serbian
by: Cvetanović, Aleksa, et al.
Published: (2024)