
Bibliographic Details
Main Author:	Davis, Ernest
Format:	Preprint
Published:	2024
Subjects:	Computers and Society; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.22340

Table of Contents:

In August 2023, Scott Aaronson and I reported the results of testing GPT-4 with the Wolfram Alpha and Code Interpreter plug-ins on a collection of 105 original high-school-level and college-level science and math problems (Davis and Aaronson, 2023). In September 2024, I tested the recently released model GPT-4o1-preview on the same collection. Overall, I found that performance had improved significantly but still fell considerably short of perfect. In particular, problems that involve spatial reasoning are often stumbling blocks.

Similar Items