Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.10093 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Table of Contents:
- Large programming courses struggle to provide timely, detailed feedback on student code. We developed Mark My Works, a local autograding system that combines traditional unit testing with LLM-generated explanations. The system uses role-based prompts to analyze submissions, critique code quality, and generate pedagogical feedback while maintaining transparency in its reasoning process. We piloted the system in a 191-student engineering course, comparing AI-generated assessments with human grading on 79 submissions. While AI scores showed no linear correlation with human scores (r = -0.177, p = 0.124), both systems exhibited similar left-skewed distributions, suggesting they recognize comparable quality hierarchies despite different scoring philosophies. The AI system demonstrated more conservative scoring (mean: 59.95 vs 80.53 human) but generated significantly more detailed technical feedback.