Staff View: People cannot distinguish GPT-4 from a human in a Turing test :: Library Catalog

Bibliographic Details
Main Authors:	Jones, Cameron R.; Bergen, Benjamin K.
Format:	Preprint
Published:	2024
Subjects:	Human-Computer Interaction; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.08007

_version_	1866929342103158784
author	Jones, Cameron R.; Bergen, Benjamin K.
author_facet	Jones, Cameron R.; Bergen, Benjamin K.
contents	We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first robust empirical demonstration that any artificial system passes an interactive 2-player Turing test. The results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected. Analysis of participants' strategies and reasoning suggests that stylistic and socio-emotional factors play a larger role in passing the Turing test than traditional notions of intelligence.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_08007
institution	arXiv
publishDate	2024
record_format	arxiv
title	People cannot distinguish GPT-4 from a human in a Turing test
topic	Human-Computer Interaction; Artificial Intelligence
url	https://arxiv.org/abs/2405.08007

Similar Items