Staff View: Do small language models generate realistic variable-quality fake news headlines? :: Library Catalog

Bibliographic Details
Main Authors:	McCutcheon, Austin; Brogly, Chris
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Information Retrieval
Online Access:	https://arxiv.org/abs/2509.00680

_version_	1866912562526814208
author	McCutcheon, Austin; Brogly, Chris
author_facet	McCutcheon, Austin; Brogly, Chris
contents	Small language models (SLMs) can generate text and could be used to produce falsified texts online. This study evaluates 14 SLMs (1.7B-14B parameters), including the LLaMA, Gemma, Phi, SmolLM, Mistral, and Granite families, on generating fake news headlines of perceived low and high quality when explicitly prompted, and on whether those headlines resemble real-world news headlines. Using controlled prompt engineering, 24,000 headlines were generated across low-quality and high-quality deceptive categories. Existing machine learning and deep learning-based news headline quality detectors were then applied to these SLM-generated fake news headlines. The SLMs showed high compliance rates with minimal ethical resistance, with occasional exceptions. Headline quality detection using established DistilBERT and bagging classifier models showed that quality misclassification was common, with detection accuracies ranging from only 35.2% to 63.5%. These findings suggest that the tested SLMs are generally compliant in generating falsified headlines, with slight variations in ethical restraint, and that the generated headlines did not closely resemble existing, primarily human-written content on the web, given the low quality-classification accuracy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_00680
institution	arXiv
publishDate	2025
record_format	arxiv
title	Do small language models generate realistic variable-quality fake news headlines?
topic	Computation and Language; Information Retrieval
url	https://arxiv.org/abs/2509.00680

Similar Items