Staff View: Turning Language Model Training from Black Box into a Sandbox :: Library Catalog

Bibliographic Details
Main Authors:	Pope, Nicolas; Tedre, Matti
Format:	Preprint
Published:	2026
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2601.21631

_version_	1866917232155557888
author	Pope, Nicolas; Tedre, Matti
author_facet	Pope, Nicolas; Tedre, Matti
contents	Most classroom engagements with generative AI focus on prompting pre-trained models, leaving the role of training data and model mechanics opaque. We developed a browser-based tool that allows students to train a small transformer language model entirely on their own device, making the training process visible. In a CS1 course, 162 students completed pre- and post-test explanations of why language models sometimes produce incorrect or strange output. After a brief hands-on training activity, students' explanations shifted significantly from anthropomorphic and misconceived accounts toward data- and model-based reasoning. The results suggest that enabling learners to directly observe training can support conceptual understanding of the data-driven nature of language models and model training, even within a short intervention. For K-12 AI literacy and AI education research, the study findings suggest that enabling students to train - and not only prompt - language models can shift how they think about AI.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_21631
institution	arXiv
publishDate	2026
record_format	arxiv
title	Turning Language Model Training from Black Box into a Sandbox
topic	Computers and Society
url	https://arxiv.org/abs/2601.21631