:: Library Catalog

Bibliographic Details
Main Authors:	Castilho, Sheila; Fitzsimmons, Zoe; Holton, Claire; Mc Donagh, Aoife
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2504.07680

Similar Items

Context-Aware Monolingual Human Evaluation of Machine Translation
by: Picinini, Silvio, et al.
Published: (2025)

Critical Confabulation: Can LLMs Hallucinate for Social Good?
by: Sui, Peiqi, et al.
Published: (2025)

Confabulation: The Surprising Value of Large Language Model Hallucinations
by: Sui, Peiqi, et al.
Published: (2024)

Audio-Based Crowd-Sourced Evaluation of Machine Translation Quality
by: Haq, Sami Ul, et al.
Published: (2025)

Confabulations from ACL Publications (CAP): A Dataset for Scientific Hallucination Detection
by: Gamba, Federica, et al.
Published: (2025)

Anchored Confabulation: Partial Evidence Non-Monotonically Amplifies Confident Hallucination in LLMs
by: Lathkar, Ashish Balkishan
Published: (2026)

Extending CREAMT: Leveraging Large Language Models for Literary Translation Post-Editing
by: Castaldo, Antonio, et al.
Published: (2025)

Fluency and Faithfulness in Human and Machine Literary Translation
by: Griebel, Sarah, et al.
Published: (2026)

Simpson's Paradox and the Accuracy-Fluency Tradeoff in Translation
by: Lim, Zheng Wei, et al.
Published: (2024)

Data-Efficient Domain Adaptation for LLM-based MT using Contrastive Preference Optimization
by: Vieira, Inacio, et al.
Published: (2025)

Translating Under Pressure: Domain-Aware LLMs for Crisis Communication
by: Castaldo, Antonio, et al.
Published: (2026)

How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes
by: Vieira, Inacio, et al.
Published: (2024)

Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task
by: Qiu, Mengyang, et al.
Published: (2025)

TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation
by: Ouyang, Jialin
Published: (2025)

Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation
by: Baltaji, Razan, et al.
Published: (2024)

OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection
by: Huang, Chenyang, et al.
Published: (2024)

ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations
by: Wang, Yindong, et al.
Published: (2025)

LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
by: Zebaze, Armel, et al.
Published: (2025)

Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training
by: Keisha, Figarri, et al.
Published: (2025)

Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
by: Zhou, Tianyi, et al.
Published: (2025)

On the Hallucination in Simultaneous Machine Translation
by: Zhong, Meizhi, et al.
Published: (2024)

LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context
by: Yamashita, Natsuo, et al.
Published: (2025)

Fairness or Fluency? An Investigation into Language Bias of Pairwise LLM-as-a-Judge
by: Zhou, Xiaolin, et al.
Published: (2026)

SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish
by: McGiff, Josh, et al.
Published: (2025)

Long-context Reference-based MT Quality Estimation
by: Haq, Sami Ul, et al.
Published: (2025)

Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency
by: Smith, Matthew L., et al.
Published: (2026)

ECO Decoding: Entropy-Based Control for Controllability and Fluency in Controllable Dialogue Generation
by: Shin, Seungmin, et al.
Published: (2025)

Span-Level Hallucination Detection for LLM-Generated Answers
by: Elchafei, Passant, et al.
Published: (2025)

Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems
by: Qian, Shenbin, et al.
Published: (2025)

Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models
by: Stogiannidis, Ilias, et al.
Published: (2025)

Banishing LLM Hallucinations Requires Rethinking Generalization
by: Li, Johnny, et al.
Published: (2024)

Word Alignment as Preference for Machine Translation
by: Wu, Qiyu, et al.
Published: (2024)

Feeding Two Birds or Favoring One? Adequacy-Fluency Tradeoffs in Evaluation and Meta-Evaluation of Machine Translation
by: Shayegh, Behzad, et al.
Published: (2025)

MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection
by: Borra, Federico, et al.
Published: (2024)

Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)

Machine Translation in the Covid domain: an English-Irish case study for LoResMT 2021
by: Lankford, Séamus, et al.
Published: (2024)

Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
by: Sun, Yuhong, et al.
Published: (2024)

LLM Hallucination Detection: HSAD
by: Li, JinXin, et al.
Published: (2025)

InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
by: Yehuda, Yakir, et al.
Published: (2024)

Synthetic Dataset Creation and Fine-Tuning of Transformer Models for Question Answering in Serbian
by: Cvetanović, Aleksa, et al.
Published: (2024)