Bibliographic Details
Main Authors: Gao, Xiang; Zhang, Jiaxin; Mouatadid, Lalla; Das, Kamalika
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2403.02509
_version_ 1866910353965711360
author Gao, Xiang
Zhang, Jiaxin
Mouatadid, Lalla
Das, Kamalika
contents In recent years, large language models (LLMs) have become increasingly prevalent, offering remarkable text generation capabilities. However, a pressing challenge is their tendency to make confidently wrong predictions, highlighting the critical need for uncertainty quantification (UQ) in LLMs. While previous works have mainly focused on addressing aleatoric uncertainty, the full spectrum of uncertainties, including epistemic uncertainty, remains inadequately explored. Motivated by this gap, we introduce a novel UQ method, sampling with perturbation for UQ (SPUQ), designed to tackle both aleatoric and epistemic uncertainties. The method entails generating a set of perturbations for LLM inputs, sampling outputs for each perturbation, and incorporating an aggregation module that generalizes the sampling uncertainty approach for text generation tasks. Through extensive experiments on various datasets, we investigated different perturbation and aggregation techniques. Our findings show a substantial improvement in model uncertainty calibration, with Expected Calibration Error (ECE) reduced by 50% on average. These results suggest that our proposed UQ method offers promising steps toward enhancing the reliability and trustworthiness of LLMs.
format Preprint
id arxiv_https___arxiv_org_abs_2403_02509
institution arXiv
publishDate 2024
record_format arxiv
title SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2403.02509