Bibliographic Details
Main Authors: Hikal, Baraa, Nasreldin, Ahmed, Hamdi, Ali, Mohammed, Ammar
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2501.16616
author Hikal, Baraa
Nasreldin, Ahmed
Hamdi, Ali
Mohammed, Ammar
contents Detecting hallucinations in generated text remains difficult for natural language processing (NLP) systems, and undetected hallucinations frequently make outputs unreliable in applications such as machine translation and definition modeling. Existing methods struggle with data scarcity and the limitations of unlabeled datasets, as highlighted by the SHROOM shared task at SemEval-2024. We propose a framework that addresses these challenges by using DeepSeek few-shot optimization, driven by iterative prompt engineering, to improve weak label generation. By restructuring the data to align with instruct generative models, we obtained high-quality annotations that considerably improved the performance of downstream models. We then fine-tuned the Mistral-7B-Instruct-v0.3 model on these optimized annotations, enabling it to detect hallucinations accurately in resource-limited settings. Combined with ensemble learning strategies, the fine-tuned model achieved 85.5% accuracy on the test set, setting a new benchmark for the SHROOM task. This study demonstrates the effectiveness of data restructuring, few-shot optimization, and fine-tuning in building scalable and robust hallucination detection frameworks for resource-constrained NLP systems.
format Preprint
id arxiv_https___arxiv_org_abs_2501_16616
institution arXiv
publishDate 2025
record_format arxiv
title Few-Shot Optimized Framework for Hallucination Detection in Resource-Limited NLP Systems
topic Computation and Language
url https://arxiv.org/abs/2501.16616