Staff View: Leveraging Prompts in LLMs to Overcome Imbalances in Complex Educational Text Data :: Library Catalog

Bibliographic Details
Main Authors:	McClure, Jeanne; Shimmei, Machi; Matsuda, Noboru; Jiang, Shiyan
Format:	Preprint
Published:	2024
Subjects:	Computers and Society; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2407.01551

_version_	1866909238054354944
author	McClure, Jeanne; Shimmei, Machi; Matsuda, Noboru; Jiang, Shiyan
author_facet	McClure, Jeanne; Shimmei, Machi; Matsuda, Noboru; Jiang, Shiyan
contents	In this paper, we explore the potential of Large Language Models (LLMs) with assertions to mitigate imbalances in educational datasets. Traditional models often fall short in such contexts, particularly due to the complexity and nuanced nature of the data. This issue is especially prominent in the education sector, where cognitive engagement levels among students show significant variation in their open responses. To test our hypothesis, we utilized an existing technology for assertion-based prompt engineering through an 'Iterative - ICL PE Design Process' comparing traditional Machine Learning (ML) models against LLMs augmented with assertions (N=135). Further, we conduct a sensitivity analysis on a subset (n=27), examining the variance in model performance concerning classification metrics and cognitive engagement levels in each iteration. Our findings reveal that LLMs with assertions significantly outperform traditional ML models, particularly in cognitive engagement levels with minority representation, registering up to a 32% increase in F1-score. Additionally, our sensitivity study indicates that incorporating targeted assertions into the LLM tested on the subset enhances its performance by 11.94%. This improvement primarily addresses errors stemming from the model's limitations in understanding context and resolving lexical ambiguities in student responses.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_01551
institution	arXiv
publishDate	2024
record_format	arxiv
title	Leveraging Prompts in LLMs to Overcome Imbalances in Complex Educational Text Data
topic	Computers and Society; Artificial Intelligence; Computation and Language
url	https://arxiv.org/abs/2407.01551

Similar Items