Staff View: Exploring the psychology of LLMs' Moral and Legal Reasoning :: Library Catalog

Bibliographic Details
Main Authors:	Almeida, Guilherme F. C. F., Nunes, José Luiz, Engelmann, Neele, Wiegmann, Alex, de Araújo, Marcelo
Format:	Preprint
Published:	2023
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2308.01264

_version_	1866914804196704256
author	Almeida, Guilherme F. C. F.; Nunes, José Luiz; Engelmann, Neele; Wiegmann, Alex; de Araújo, Marcelo
author_facet	Almeida, Guilherme F. C. F.; Nunes, José Luiz; Engelmann, Neele; Wiegmann, Alex; de Araújo, Marcelo
contents	Large language models (LLMs) exhibit expert-level performance in tasks across a wide range of domains. The ethical issues raised by LLMs and the need to align future versions make it important to know how state-of-the-art models reason about moral and legal issues. In this paper, we employ the methods of experimental psychology to probe this question. We replicate eight studies from the experimental literature with instances of Google's Gemini Pro, Anthropic's Claude 2.1, OpenAI's GPT-4, and Meta's Llama 2 Chat 70b. We find that alignment with human responses shifts from one experiment to another, and that models differ among themselves in their overall alignment, with GPT-4 taking a clear lead over all other models we tested. Nonetheless, even when LLM-generated responses are highly correlated with human responses, there are still systematic differences, with a tendency for models to exaggerate effects that are present among humans, in part by reducing variance. This recommends caution with regard to proposals to replace human participants with current state-of-the-art LLMs in psychological research and highlights the need for further research on the distinctive aspects of machine psychology.
format	Preprint
id	arxiv_https___arxiv_org_abs_2308_01264
institution	arXiv
publishDate	2023
record_format	arxiv
title	Exploring the psychology of LLMs' Moral and Legal Reasoning
topic	Artificial Intelligence; Computation and Language
url	https://arxiv.org/abs/2308.01264
