Bibliographic Details
Main Authors: Lee, Patrick Yung Kang; Bucci, Paul Hendrik; Foord-Kelcey, Leo Itsuki; Singh, Alamjeet; Beschastnikh, Ivan
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2402.06124
Abstract:
  • Large text corpora, such as Reddit posts, have become an increasingly prevalent site of qualitative inquiry. However, most large text corpora are intractable for qualitative researchers. Instead, teams rely on statistical subsampling to reduce corpora to a manageable size for qualitative analysis. While previous work on navigating large corpora visualizes the dataset at the corpus level using high-level statistical summaries, few systems offer the ability to curate data using an interpretivist approach. To address this, we developed Teleoscope, a web-based interface designed to scaffold iterative, interactive, and reflexive refinement of a large corpus, in a process we call thematic curation. Across three deployments, we learned that Teleoscope supports serendipitous discovery of new keywords, results in greater feelings of confidence in search saturation, and aids collaborative discussion of alternative curation pathways. Teleoscope empowers researchers to stay "close to the data" in order to make qualitative workflows methodologically coherent with large text corpora.