Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Freeman, Brian, Kicklighter, Adam, Erdman, Matt, Gordon, Zach
Format:	Preprint
Published:	2026
Subjects:	Software Engineering Artificial Intelligence Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2603.10047
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866908938769793024
author	Freeman, Brian Kicklighter, Adam Erdman, Matt Gordon, Zach
author_facet	Freeman, Brian Kicklighter, Adam Erdman, Matt Gordon, Zach
contents	Hallucinations in large language models (LLMs) are outputs that are syntactically coherent but factually incorrect or contextually inconsistent. They are persistent obstacles in high-stakes industrial settings such as engineering design, enterprise resource planning, and IoT telemetry platforms. We present and compare five prompt engineering strategies intended to reduce the variance of model outputs and move toward repeatable, grounded results without modifying model weights or creating complex validation models. These methods include: (M1) Iterative Similarity Convergence, (M2) Decomposed Model-Agnostic Prompting, (M3) Single-Task Agent Specialization, (M4) Enhanced Data Registry, and (M5) Domain Glossary Injection. Each method is evaluated against an internal baseline using an LLM-as-Judge framework over 100 repeated runs per method (same fixed task prompt, stochastic decoding at tau = 0.7. Under this evaluation setup, M4 (Enhanced Data Registry) received ``Better'' verdicts in all 100 trials; M3 and M5 reached 80% and 77% respectively; M1 reached 75%; and M2 was net negative at 34% when compared to single shot prompting with a modern foundation model. We then developed enhanced version 2 (v2) implementations and assessed them on a 10-trial verification batch; M2 recovered from 34% to 80%, the largest gain among the four revised methods. We discuss how these strategies help overcome the non-deterministic nature of LLM results for industrial procedures, even when absolute correctness cannot be guaranteed. We provide pseudocode, verbatim prompts, and batch logs to support independent assessment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_10047
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction Freeman, Brian Kicklighter, Adam Erdman, Matt Gordon, Zach Software Engineering Artificial Intelligence Human-Computer Interaction Hallucinations in large language models (LLMs) are outputs that are syntactically coherent but factually incorrect or contextually inconsistent. They are persistent obstacles in high-stakes industrial settings such as engineering design, enterprise resource planning, and IoT telemetry platforms. We present and compare five prompt engineering strategies intended to reduce the variance of model outputs and move toward repeatable, grounded results without modifying model weights or creating complex validation models. These methods include: (M1) Iterative Similarity Convergence, (M2) Decomposed Model-Agnostic Prompting, (M3) Single-Task Agent Specialization, (M4) Enhanced Data Registry, and (M5) Domain Glossary Injection. Each method is evaluated against an internal baseline using an LLM-as-Judge framework over 100 repeated runs per method (same fixed task prompt, stochastic decoding at tau = 0.7. Under this evaluation setup, M4 (Enhanced Data Registry) received ``Better'' verdicts in all 100 trials; M3 and M5 reached 80% and 77% respectively; M1 reached 75%; and M2 was net negative at 34% when compared to single shot prompting with a modern foundation model. We then developed enhanced version 2 (v2) implementations and assessed them on a 10-trial verification batch; M2 recovered from 34% to 80%, the largest gain among the four revised methods. We discuss how these strategies help overcome the non-deterministic nature of LLM results for industrial procedures, even when absolute correctness cannot be guaranteed. We provide pseudocode, verbatim prompts, and batch logs to support independent assessment.
title	Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction
topic	Software Engineering Artificial Intelligence Human-Computer Interaction
url	https://arxiv.org/abs/2603.10047

Similar Items