Bibliographic Details
Main Author: Salsman, Jim
Format: Preprint
Published: 2026
Subjects: Computation and Language; Human-Computer Interaction
Online Access: https://arxiv.org/abs/2605.26428
_version_ 1866910268633645056
author Salsman, Jim
author_facet Salsman, Jim
contents Generating high-quality, pedagogically useful questions from lecture slide decks is difficult because important instructional content is distributed across both text and visual elements, and because useful questions must be scaffolded across the flow of a presentation rather than generated slide by slide in isolation. This paper describes Slide Deck Q&A Quality Assurance (slidesqaqa), a Flask-based software system that extracts text and rendered images from PDF slides and processes them through a four-stage large language model pipeline comprising window planning, deck synthesis, slide annotation, and reconciliation. The system reasons jointly about slide modality and pedagogical role, allocates bounded question budgets, and revises draft annotations at the deck level to reduce redundancy and improve coverage. The final output is a structured JSON annotation containing deck-level goals, section structure, slide-level summaries, question sets, and evaluation scores. Initial experiments on two technical lecture decks indicate that the pipeline can filter non-instructional slides and produce high-fidelity, pedagogically coherent questions for visually complex content. The working system is at https://slidesqaqa-974767694043.us-west1.run.app and the software repository is at https://github.com/blinding2submit/slidesqaqa
format Preprint
id arxiv_https___arxiv_org_abs_2605_26428
institution arXiv
publishDate 2026
record_format arxiv
title Slide Deck Q&A Quality Assurance App: A Multi-Stage Pipeline for Pedagogical Question Generation
topic Computation and Language
Human-Computer Interaction
68T50
K.3.1; D.2.2
url https://arxiv.org/abs/2605.26428
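
The abstract describes the final output as a structured JSON annotation with deck-level goals, section structure, slide-level summaries, question sets, and evaluation scores. The record does not give the schema, so the following is only a minimal sketch of what such a structure might look like. Every class name, field name, and value below is hypothetical and is not taken from the slidesqaqa repository.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Question:
    # Hypothetical fields: the question text and an evaluation score.
    text: str
    score: float


@dataclass
class SlideAnnotation:
    # Hypothetical per-slide record: a summary plus its question set.
    slide_number: int
    summary: str
    questions: List[Question] = field(default_factory=list)


@dataclass
class DeckAnnotation:
    # Hypothetical deck-level record: goals, section titles, and slides.
    goals: List[str]
    sections: List[str]
    slides: List[SlideAnnotation] = field(default_factory=list)


def to_json(deck: DeckAnnotation) -> str:
    """Serialize the nested dataclasses to a JSON string."""
    return json.dumps(asdict(deck), indent=2)


# Placeholder content only; not real output from the system.
example = DeckAnnotation(
    goals=["Placeholder deck-level goal"],
    sections=["Placeholder section"],
    slides=[
        SlideAnnotation(
            slide_number=1,
            summary="Placeholder slide summary",
            questions=[Question(text="Placeholder question?", score=0.5)],
        )
    ],
)

if __name__ == "__main__":
    print(to_json(example))
```

Nesting questions under slides and slides under the deck mirrors the deck-level and slide-level split named in the abstract. The actual system may organize its output differently.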