Bibliographic Details
Main Authors: Shangguan, Zhongkai; Huang, Zanming; Ohn-Bar, Eshed; Ozernov-Palchik, Ola; Kosty, Derek; Stoolmiller, Michael; Fien, Hank
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2412.10401
_version_ 1866912155747483648
author Shangguan, Zhongkai
Huang, Zanming
Ohn-Bar, Eshed
Ozernov-Palchik, Ola
Kosty, Derek
Stoolmiller, Michael
Fien, Hank
author_facet Shangguan, Zhongkai
Huang, Zanming
Ohn-Bar, Eshed
Ozernov-Palchik, Ola
Kosty, Derek
Stoolmiller, Michael
Fien, Hank
contents Models for student reading performance can empower educators and institutions to proactively identify at-risk students, thereby enabling early and tailored instructional interventions. However, there are no suitable publicly available educational datasets for modeling and predicting future reading performance. In this work, we introduce the Enhanced Core Reading Instruction (ECRI) dataset, a novel large-scale longitudinal tabular dataset collected across 44 schools with 6,916 students and 172 teachers. We leverage the dataset to empirically evaluate the ability of state-of-the-art machine learning models to recognize early childhood educational patterns in multivariate and partial measurements. Specifically, we demonstrate that a simple self-supervised strategy, in which a Multi-Layer Perceptron (MLP) network is pre-trained over masked inputs, outperforms several strong baselines while generalizing over diverse educational settings. To facilitate future developments in precise modeling and responsible use of models for individualized and early intervention strategies, our data and code are available at https://ecri-data.github.io/.
format Preprint
id arxiv_https___arxiv_org_abs_2412_10401
institution arXiv
publishDate 2024
record_format arxiv
title Scalable Early Childhood Reading Performance Prediction
topic Machine Learning
url https://arxiv.org/abs/2412.10401
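
The abstract mentions pre-training an MLP over masked inputs. As a rough illustration of what masking tabular features and scoring a reconstruction might look like, here is a minimal sketch. It is not the authors' implementation: the function names `mask_features` and `masked_mse`, the zero fill value, and the masking rate are all assumptions made for this example, and the MLP itself is omitted.

```python
import random


def mask_features(row, mask_rate, rng, mask_value=0.0):
    """Hide a random subset of features in one tabular row.

    Hypothetical helper: each feature is independently replaced by
    mask_value with probability mask_rate. Returns (masked_row, mask),
    where mask[i] is True if feature i was hidden.
    """
    mask = [rng.random() < mask_rate for _ in row]
    masked = [mask_value if hidden else x for x, hidden in zip(row, mask)]
    return masked, mask


def masked_mse(pred, target, mask):
    """Mean squared error computed over the hidden positions only.

    Hypothetical pre-training objective: the model is scored on how well
    it reconstructs the features it could not see.
    """
    errors = [(p - t) ** 2 for p, t, hidden in zip(pred, target, mask) if hidden]
    return sum(errors) / len(errors) if errors else 0.0


# Example: mask roughly half of a (made-up) feature row.
rng = random.Random(0)
row = [0.2, 0.5, 0.9, 0.1]
masked_row, mask = mask_features(row, mask_rate=0.5, rng=rng)
# A model would map masked_row to a reconstruction; here the masked row
# itself stands in as a trivial "prediction" to show the loss call.
loss = masked_mse(masked_row, row, mask)
```

In a full pipeline, a network trained to minimize this kind of loss would presumably be fine-tuned afterwards on the reading-performance target; the details of that step are described in the preprint, not here.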