Bibliographic Details
Main Authors: Giorgi, Salvatore, Liu, Tingting, Aich, Ankit, Isman, Kelsey, Sherman, Garrick, Fried, Zachary, Sedoc, João, Ungar, Lyle H., Curtis, Brenda
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2406.14462
author Giorgi, Salvatore
Liu, Tingting
Aich, Ankit
Isman, Kelsey
Sherman, Garrick
Fried, Zachary
Sedoc, João
Ungar, Lyle H.
Curtis, Brenda
contents Large language models (LLMs) are increasingly used in human-centered social scientific tasks, such as data annotation, synthetic data creation, and engaging in dialog. However, these tasks are highly subjective and depend on human factors, such as one's environment, attitudes, beliefs, and lived experiences. Employing LLMs (which have no such human factors) in these tasks may therefore produce data that lacks variation and fails to reflect the diversity of human experiences. In this paper, we examine the role of prompting LLMs with human-like personas and asking the models to answer as if they were a specific human. This is done explicitly, with exact demographics, political beliefs, and lived experiences, or implicitly, via names prevalent in specific populations. The LLM personas are then evaluated via (1) a subjective annotation task (e.g., detecting toxicity) and (2) a belief generation task, both of which are known to vary across human factors. We examine the impact of explicit vs. implicit personas and investigate which human factors LLMs recognize and respond to. Results show that explicit LLM personas yield mixed results when reproducing known human biases, but generally fail to demonstrate implicit biases. We conclude that LLMs may capture the statistical patterns of how people speak, but are generally unable to model the complex interactions and subtleties of human perceptions, potentially limiting their effectiveness in social science applications.
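The contrast between explicit and implicit personas described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Persona` class, the `persona_preamble` and `build_prompt` helpers, the prompt wording, and the example values are hypothetical and are not taken from the paper. The name "Alex" is a neutral placeholder and is not claimed to be prevalent in any population.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Persona:
    """Hypothetical container for the human factors attached to a persona."""
    name: Optional[str] = None               # implicit cue: a name only
    demographics: Optional[str] = None       # explicit factor
    political_beliefs: Optional[str] = None  # explicit factor
    lived_experience: Optional[str] = None   # explicit factor


def persona_preamble(p: Persona) -> str:
    """Build the instruction that precedes the task (wording is illustrative)."""
    explicit = [f for f in (p.demographics, p.political_beliefs, p.lived_experience) if f]
    if explicit:
        # Explicit persona: the human factors are stated outright.
        return "Answer as if you were a person who is " + ", ".join(explicit) + "."
    if p.name:
        # Implicit persona: only a name is given; any factors must be inferred.
        return f"Answer as if you were a person named {p.name}."
    # No persona at all: a plain baseline prompt.
    return "Answer the question."


def build_prompt(p: Persona, task: str) -> str:
    """Join the persona preamble and the task text into one prompt string."""
    return persona_preamble(p) + "\n\n" + task


if __name__ == "__main__":
    task = "Rate the toxicity of the following comment from 1 to 5: <comment>"
    explicit = Persona(demographics="55 years old",
                       political_beliefs="politically conservative")
    implicit = Persona(name="Alex")
    print(build_prompt(explicit, task))
    print(build_prompt(implicit, task))
```

Under these assumptions, the same task text is paired with different preambles, so differences in a model's answers can be attributed to the persona alone. The sketch only builds prompt strings and does not call any model.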
format Preprint
id arxiv_https___arxiv_org_abs_2406_14462
institution arXiv
publishDate 2024
record_format arxiv
title Modeling Human Subjectivity in LLMs Using Explicit and Implicit Human Factors in Personas
topic Computation and Language
url https://arxiv.org/abs/2406.14462