Bibliographic Details
Main Authors: van der Veen, Olaf, Dzebo, Semir, Littvay, Levi, Hawkins, Kirk, Dar, Oren
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2408.15213
author van der Veen, Olaf
Dzebo, Semir
Littvay, Levi
Hawkins, Kirk
Dar, Oren
contents Populism is a concept that is often used but notoriously difficult to measure. Common qualitative measurements like holistic grading or content analysis require great amounts of time and labour, making it difficult to quickly scope out which politicians should be classified as populist and which should not, while quantitative methods show mixed results when it comes to classifying populist rhetoric. In this paper, we develop a pipeline to train and validate an automated classification model to estimate the use of populist language. We train models based on sentences that were identified as populist and pluralist in 300 US governors' speeches from 2010 to 2018 and in 45 speeches of presidential candidates in 2016. We find that these models classify most speeches correctly, including 84% of governor speeches and 89% of presidential speeches. These results extend to different time periods (with 92% accuracy on more recent American governors), different amounts of data (with as few as 70 training sentences per category achieving similar results), and when classifying politicians instead of individual speeches. This pipeline is thus an effective tool that can optimise the systematic and swift classification of the use of populist language in politicians' speeches.
format Preprint
id arxiv_https___arxiv_org_abs_2408_15213
institution arXiv
publishDate 2024
record_format arxiv
title Classifying populist language in American presidential and governor speeches using automatic text analysis
topic Computation and Language
url https://arxiv.org/abs/2408.15213