Table of Contents: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Qiu, Yiding, Azimi, Seyed Mahdi, Lensky, Artem
Format:	Preprint
Published:	2026
Subjects:	Software Engineering
Online Access:	https://arxiv.org/abs/2601.10093
Tags:	Add Tag No Tags, Be the first to tag this record!

Table of Contents:

Large programming courses struggle to provide timely, detailed feedback on student code. We developed Mark My Works, a local autograding system that combines traditional unit testing with LLM-generated explanations. The system uses role-based prompts to analyze submissions, critique code quality, and generate pedagogical feedback while maintaining transparency in its reasoning process. We piloted the system in a 191-student engineering course, comparing AI-generated assessments with human grading on 79 submissions. While AI scores showed no linear correlation with human scores (r = -0.177, p = 0.124), both systems exhibited similar left-skewed distributions, suggesting they recognize comparable quality hierarchies despite different scoring philosophies. The AI system demonstrated more conservative scoring (mean: 59.95 vs 80.53 human) but generated significantly more detailed technical feedback.

Similar Items