Staff View: I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench :: Library Catalog

Bibliographic Details
Main Authors:	Li, Yuan; Huang, Yue; Lin, Yuli; Wu, Siyuan; Wan, Yao; Sun, Lichao
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2401.17882

_version_	1866916127714574336
author	Li, Yuan; Huang, Yue; Lin, Yuli; Wu, Siyuan; Wan, Yao; Sun, Lichao
author_facet	Li, Yuan; Huang, Yue; Lin, Yuli; Wu, Siyuan; Wan, Yao; Sun, Lichao
contents	Do large language models (LLMs) exhibit any form of awareness similar to humans? In this paper, we introduce AwareBench, a benchmark designed to evaluate awareness in LLMs. Drawing from theories in psychology and philosophy, we define awareness in LLMs as the ability to understand themselves as AI models and to exhibit social intelligence. Subsequently, we categorize awareness in LLMs into five dimensions: capability, mission, emotion, culture, and perspective. Based on this taxonomy, we create a dataset called AwareEval, which contains binary, multiple-choice, and open-ended questions to assess LLMs' understanding of specific awareness dimensions. Our experiments, conducted on 13 LLMs, reveal that the majority of them struggle to fully recognize their capabilities and missions while demonstrating decent social intelligence. We conclude by connecting awareness of LLMs with AI alignment and safety, emphasizing its significance to the trustworthy and ethical development of LLMs. Our dataset and code are available at https://github.com/HowieHwong/Awareness-in-LLM.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_17882
institution	arXiv
publishDate	2024
record_format	arxiv
title	I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench
topic	Computation and Language
url	https://arxiv.org/abs/2401.17882

Similar Items