Staff View: Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States

Bibliographic Details
Main Authors:	Duan, Hanyu; Yang, Yi; Tam, Kar Yan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2402.09733

_version_	1866914679798890496
author	Duan, Hanyu; Yang, Yi; Tam, Kar Yan
author_facet	Duan, Hanyu; Yang, Yi; Tam, Kar Yan
contents	Large Language Models (LLMs) can make up answers that are not real, a behavior known as hallucination. This research investigates whether, how, and to what extent LLMs are aware of hallucination. More specifically, we check whether and how an LLM reacts differently in its hidden states when it answers a question correctly versus when it hallucinates. To do this, we introduce an experimental framework that allows examining an LLM's hidden states in different hallucination situations. Building upon this framework, we conduct a series of experiments with language models in the LLaMA family (Touvron et al., 2023). Our empirical findings suggest that LLMs react differently when processing a genuine response versus a fabricated one. We then apply various model interpretation techniques to help understand and explain the findings. Moreover, informed by the empirical observations, we show the great potential of using guidance derived from the LLM's hidden representation space to mitigate hallucination. We believe this work provides insights into how LLMs produce hallucinated answers and how to make them occur less often.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_09733
institution	arXiv
publishDate	2024
record_format	arxiv
title	Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
topic	Computation and Language
url	https://arxiv.org/abs/2402.09733
