Staff View :: Library Catalog

Bibliographic Details
Main Authors:	De Paoli, Stefano; Mathis, Walter Stan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Computers and Society
Online Access:	https://arxiv.org/abs/2401.03239

_version_	1866916082036506624
author	De Paoli, Stefano; Mathis, Walter Stan
author_facet	De Paoli, Stefano; Mathis, Walter Stan
contents	This paper presents a set of reflections on saturation and the use of Large Language Models (LLMs) for performing Thematic Analysis (TA). The paper suggests that initial thematic saturation (ITS) could be used as a metric to assess part of the transactional validity of TA with LLM, focusing on the initial coding. The paper presents the initial coding of two datasets of different sizes, and it reflects on how the LLM reaches some form of analytical saturation during the coding. The procedure proposed in this work leads to the creation of two codebooks, one comprising the total cumulative initial codes and the other the total unique codes. The paper proposes a metric to synthetically measure ITS using a simple mathematical calculation employing the ratio between slopes of cumulative codes and unique codes. The paper contributes to the initial body of work exploring how to perform qualitative analysis with LLMs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_03239
institution	arXiv
publishDate	2024
record_format	arxiv
title	Reflections on Inductive Thematic Saturation as a potential metric for measuring the validity of an inductive Thematic Analysis with LLMs
topic	Computation and Language; Computers and Society
url	https://arxiv.org/abs/2401.03239
