Staff View: SimGrade: Using Code Similarity Measures for More Accurate Human Grading :: Library Catalog

Bibliographic Details
Main Authors:	Johnson-Yu, Sonja; Bowman, Nicholas; Sahami, Mehran; Piech, Chris
Format:	Preprint
Published:	2024
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2403.14637

_version_	1866910377887924224
author	Johnson-Yu, Sonja; Bowman, Nicholas; Sahami, Mehran; Piech, Chris
author_facet	Johnson-Yu, Sonja; Bowman, Nicholas; Sahami, Mehran; Piech, Chris
contents	While the use of programming problems on exams is a common form of summative assessment in CS courses, grading such exam problems can be a difficult and inconsistent process. Through an analysis of historical grading patterns, we show that inaccurate and inconsistent grading of free-response programming problems is widespread in CS1 courses. These inconsistencies necessitate the development of methods to ensure fairer and more accurate grading. In a subsequent analysis of this historical exam data, we demonstrate that graders assign a score to a student submission more accurately when they have previously seen another submission similar to it. As a result, we hypothesize that we can improve exam grading accuracy by ensuring that each submission a grader sees is similar to at least one submission they have previously seen. We propose several algorithms for (1) assigning student submissions to graders, and (2) ordering submissions to maximize the probability that a grader has previously seen a similar solution, leveraging distributed representations of student code to measure similarity between submissions. Finally, we demonstrate in simulation that these algorithms achieve higher grading accuracy than the current standard random assignment process used for grading.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_14637
institution	arXiv
publishDate	2024
record_format	arxiv
title	SimGrade: Using Code Similarity Measures for More Accurate Human Grading
topic	Computers and Society
url	https://arxiv.org/abs/2403.14637
