Staff View: SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Gao, Xiang; Zhang, Jiaxin; Mouatadid, Lalla; Das, Kamalika
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.02509

_version_	1866910353965711360
author	Gao, Xiang; Zhang, Jiaxin; Mouatadid, Lalla; Das, Kamalika
author_facet	Gao, Xiang; Zhang, Jiaxin; Mouatadid, Lalla; Das, Kamalika
contents	In recent years, large language models (LLMs) have become increasingly prevalent, offering remarkable text generation capabilities. However, a pressing challenge is their tendency to make confidently wrong predictions, highlighting the critical need for uncertainty quantification (UQ) in LLMs. While previous works have mainly focused on addressing aleatoric uncertainty, the full spectrum of uncertainties, including epistemic, remains inadequately explored. Motivated by this gap, we introduce a novel UQ method, sampling with perturbation for UQ (SPUQ), designed to tackle both aleatoric and epistemic uncertainties. The method entails generating a set of perturbations for LLM inputs, sampling outputs for each perturbation, and incorporating an aggregation module that generalizes the sampling uncertainty approach for text generation tasks. Through extensive experiments on various datasets, we investigated different perturbation and aggregation techniques. Our findings show a substantial improvement in model uncertainty calibration, with a reduction in Expected Calibration Error (ECE) by 50% on average. Our findings suggest that our proposed UQ method offers promising steps toward enhancing the reliability and trustworthiness of LLMs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_02509
institution	arXiv
publishDate	2024
record_format	arxiv
title	SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2403.02509

Similar Items