Saved in:
Bibliographic Details
Main Authors: Rad, Mohammad Heydari, Afari, Rezvan, Momtazi, Saeedeh
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2510.15134
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • Multiple-choice questions (MCQs) are commonly used in educational testing, as they offer an efficient means of evaluating learners' knowledge. However, generating high-quality MCQs, particularly in low-resource languages such as Persian, remains a significant challenge. This paper introduces FarsiMCQGen, an innovative approach for generating Persian-language MCQs. Our methodology combines candidate generation, filtering, and ranking techniques to build a model that generates answer choices resembling those in real MCQs. We leverage advanced methods, including Transformers and knowledge graphs, integrated with rule-based approaches to craft credible distractors that challenge test-takers. Our work is based on data from Wikipedia, which includes general knowledge questions. Furthermore, this study introduces a novel Persian MCQ dataset comprising 10,289 questions. This dataset is evaluated by different state-of-the-art large language models (LLMs). Our results demonstrate the effectiveness of our model and the quality of the generated dataset, which has the potential to inspire further research on MCQs.