Bibliographic Details
Main Author: Schmal, W. Benedikt
Format: Preprint
Published: 2024
Subjects: General Economics, Economics
Online Access: https://arxiv.org/abs/2404.18499
author Schmal, W. Benedikt
author_facet Schmal, W. Benedikt
contents Natural language processing tools are now frequently used in social sciences such as economics, political science, and sociology. Many publications apply topic modeling to elicit latent topics in text corpora and trace their development over time. Most of these publications rely on visual inspection to draw inferences about changes, structural breaks, and developments over time. We suggest using univariate time series econometrics to introduce more quantitative rigor that can strengthen the analyses. In particular, we discuss the econometric topics of non-stationarity and structural breaks. This paper serves as a comprehensive practitioner's guide that gives researchers in the social and life sciences as well as the humanities concise advice on how to implement econometric time series methods to thoroughly investigate topic prevalences over time. We provide coding advice for the statistical software R throughout the paper. An application of the discussed tools to a sample dataset completes the analysis.
format Preprint
id arxiv_https___arxiv_org_abs_2404_18499
institution arXiv
publishDate 2024
record_format arxiv
title Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide
topic General Economics
Economics
url https://arxiv.org/abs/2404.18499