_version_ 1866911343918972928
author LearnLM Team
Eedi
Wang, Albert
Rysbek, Aliya
Huber, Andrea
Nambiar, Anjali
Kenolty, Anna
Caulfield, Ben
Lilley-Draper, Beth
Groot, Bibi
Veprek, Brian
Burdett, Chelsea
Willis, Claire
Barton, Craig
Smith, Digory
Mu, George
Walters, Harriet
Jurenka, Irina
Hulls, Iris
Stalley-Moores, James
Caton, Jonathan
Wilkowski, Julia
Alarakyia, Kaiz
McKee, Kevin R.
McCafferty, Liam
Dalton, Lucy
Kunesch, Markus
Malubay, Pauline
Kidson, Rachel
Wells, Rich
Wheeler, Sam
Wiltberger, Sara
Mohamed, Shakir
Woodhead, Simon
Brazão, Vasco
author_facet LearnLM Team
Eedi
Wang, Albert
Rysbek, Aliya
Huber, Andrea
Nambiar, Anjali
Kenolty, Anna
Caulfield, Ben
Lilley-Draper, Beth
Groot, Bibi
Veprek, Brian
Burdett, Chelsea
Willis, Claire
Barton, Craig
Smith, Digory
Mu, George
Walters, Harriet
Jurenka, Irina
Hulls, Iris
Stalley-Moores, James
Caton, Jonathan
Wilkowski, Julia
Alarakyia, Kaiz
McKee, Kevin R.
McCafferty, Liam
Dalton, Lucy
Kunesch, Markus
Malubay, Pauline
Kidson, Rachel
Wells, Rich
Wheeler, Sam
Wiltberger, Sara
Mohamed, Shakir
Woodhead, Simon
Brazão, Vasco
contents One-to-one tutoring is widely considered the gold standard for personalized education, yet it remains prohibitively expensive to scale. To evaluate whether generative AI might help expand access to this resource, we conducted an exploratory randomized controlled trial (RCT) with $N = 165$ students across five UK secondary schools. We integrated LearnLM -- a generative AI model fine-tuned for pedagogy -- into chat-based tutoring sessions on the Eedi mathematics platform. In the RCT, expert tutors directly supervised LearnLM, with the remit to revise each message it drafted until they would be satisfied sending it themselves. LearnLM proved to be a reliable source of pedagogical instruction, with supervising tutors approving 76.4% of its drafted messages with zero or minimal edits (i.e., changing only one or two characters). This translated into effective tutoring support: students guided by LearnLM performed at least as well as students chatting with human tutors on each learning outcome we measured. In fact, students who received support from LearnLM were 5.5 percentage points more likely to solve novel problems on subsequent topics (with a success rate of 66.2%) than those who received tutoring from human tutors alone (rate of 60.7%). In interviews, tutors highlighted LearnLM's strength at drafting Socratic questions that encouraged deeper reflection from students, with multiple tutors even reporting that they learned new pedagogical practices from the model. Overall, our results suggest that pedagogically fine-tuned AI tutoring systems may play a promising role in delivering effective, individualized learning support at scale.
format Preprint
id arxiv_https___arxiv_org_abs_2512_23633
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms
LearnLM Team
Eedi
Wang, Albert
Rysbek, Aliya
Huber, Andrea
Nambiar, Anjali
Kenolty, Anna
Caulfield, Ben
Lilley-Draper, Beth
Groot, Bibi
Veprek, Brian
Burdett, Chelsea
Willis, Claire
Barton, Craig
Smith, Digory
Mu, George
Walters, Harriet
Jurenka, Irina
Hulls, Iris
Stalley-Moores, James
Caton, Jonathan
Wilkowski, Julia
Alarakyia, Kaiz
McKee, Kevin R.
McCafferty, Liam
Dalton, Lucy
Kunesch, Markus
Malubay, Pauline
Kidson, Rachel
Wells, Rich
Wheeler, Sam
Wiltberger, Sara
Mohamed, Shakir
Woodhead, Simon
Brazão, Vasco
Computers and Society
Artificial Intelligence
Machine Learning
One-to-one tutoring is widely considered the gold standard for personalized education, yet it remains prohibitively expensive to scale. To evaluate whether generative AI might help expand access to this resource, we conducted an exploratory randomized controlled trial (RCT) with $N = 165$ students across five UK secondary schools. We integrated LearnLM -- a generative AI model fine-tuned for pedagogy -- into chat-based tutoring sessions on the Eedi mathematics platform. In the RCT, expert tutors directly supervised LearnLM, with the remit to revise each message it drafted until they would be satisfied sending it themselves. LearnLM proved to be a reliable source of pedagogical instruction, with supervising tutors approving 76.4% of its drafted messages with zero or minimal edits (i.e., changing only one or two characters). This translated into effective tutoring support: students guided by LearnLM performed at least as well as students chatting with human tutors on each learning outcome we measured. In fact, students who received support from LearnLM were 5.5 percentage points more likely to solve novel problems on subsequent topics (with a success rate of 66.2%) than those who received tutoring from human tutors alone (rate of 60.7%). In interviews, tutors highlighted LearnLM's strength at drafting Socratic questions that encouraged deeper reflection from students, with multiple tutors even reporting that they learned new pedagogical practices from the model. Overall, our results suggest that pedagogically fine-tuned AI tutoring systems may play a promising role in delivering effective, individualized learning support at scale.
title AI tutoring can safely and effectively support students: An exploratory RCT in UK classrooms
topic Computers and Society
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2512.23633