Bibliographic Details
Main Authors: Triplett, Steven; Minami, Simon; Verma, Rakesh
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2410.14814
author Triplett, Steven
Minami, Simon
Verma, Rakesh
contents In the modern age, an enormous amount of communication occurs online, and it is difficult to know whether something written is genuine or deceitful. There are many reasons for someone to deceive online (e.g., monetary gain, political gain), and detecting this behavior without any physical interaction is a difficult task. Additionally, deception occurs in several text-only domains, and it is unclear whether these various sources can be leveraged to improve detection. To address this, we use eight datasets from various domains to evaluate their effect on classifier performance when combined with transfer learning via intermediate layer concatenation of fine-tuned BERT models. We find improvements in accuracy over the baseline. Furthermore, we evaluate multiple distance measurements between datasets and find that Jensen-Shannon distance correlates moderately with transfer learning performance. Finally, we evaluate the impact on BERT performance of multiple methods that add information to a dataset's text via named entities, and we find a notable improvement in accuracy of up to 11.2%.
format Preprint
id arxiv_https___arxiv_org_abs_2410_14814
institution arXiv
publishDate 2024
record_format arxiv
title Effects of Soft-Domain Transfer and Named Entity Information on Deception Detection
topic Computation and Language
Machine Learning
url https://arxiv.org/abs/2410.14814
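The abstract reports that Jensen-Shannon distance between datasets correlates moderately with transfer learning performance. As a rough illustration of that measure only, the sketch below computes a Jensen-Shannon distance between two small text collections. The whitespace tokenization, the unigram (word frequency) distributions, and the function names are assumptions made for this example; the record does not say how the authors represent each dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for illustration.
    return text.lower().split()


def unigram_distribution(texts, vocab):
    # Relative frequency of each vocabulary word in a non-empty text collection.
    counts = Counter(tok for t in texts for tok in tokenize(t))
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total for w in vocab]


def kl_divergence(p, q):
    # Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    # Square root of the Jensen-Shannon divergence; with log base 2 it lies in [0, 1].
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))


def dataset_distance(texts_a, texts_b):
    # Hypothetical helper: compare two datasets over their shared vocabulary.
    vocab = sorted({tok for t in texts_a + texts_b for tok in tokenize(t)})
    p = unigram_distribution(texts_a, vocab)
    q = unigram_distribution(texts_b, vocab)
    return jensen_shannon_distance(p, q)


if __name__ == "__main__":
    reviews = ["the hotel was great", "great room and great staff"]
    emails = ["claim your prize now", "the prize is waiting"]
    print(dataset_distance(reviews, emails))
```

A distance of 0 means the two word distributions are identical, and 1 means they share no vocabulary. Any other representation of a dataset as a probability distribution could be substituted for the unigram counts.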