Bibliographic Details
Main Authors: Almeida, Leonardo; Rodrigues, Pedro; Magalhães, Diogo; Pinho, Armando J.; Pratas, Diogo
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2411.19869
Abstract:
  • This paper introduces AIDetx, a novel method for detecting machine-generated text using data compression techniques. Traditional approaches, such as deep learning classifiers, often suffer from high computational costs and limited interpretability. To address these limitations, we propose a compression-based classification framework that leverages finite-context models (FCMs). AIDetx constructs distinct compression models for human-written and AI-generated text, classifying new inputs based on which model achieves a higher compression ratio. We evaluated AIDetx on two benchmark datasets, achieving F1 scores exceeding 97% and 99%, respectively, highlighting its high accuracy. Compared to current methods, such as large language models (LLMs), AIDetx offers a more interpretable and computationally efficient solution, significantly reducing both training time and hardware requirements (e.g., no GPUs needed). The full implementation is publicly available at https://github.com/AIDetx/AIDetx.
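The classification idea described in the abstract can be illustrated with a minimal sketch. This is not the AIDetx implementation; it only assumes that each class gets its own order-k character model and that a new text is assigned to the class whose model would encode it in fewer bits (that is, with the higher compression ratio). The names `FiniteContextModel`, `classify`, `order`, `alpha`, and `alphabet_size` are hypothetical, and the estimated code length stands in for running a real compressor.

```python
import math
from collections import Counter, defaultdict


class FiniteContextModel:
    """Order-k character model with additive smoothing (illustrative only)."""

    def __init__(self, order=3, alpha=0.5, alphabet_size=256):
        self.order = order                  # context length k (assumed value)
        self.alpha = alpha                  # smoothing parameter (assumed value)
        self.alphabet_size = alphabet_size  # assumed symbol count for smoothing
        self.counts = defaultdict(Counter)  # context -> counts of next symbol

    def train(self, text):
        """Count which symbol follows each length-k context in the text."""
        k = self.order
        for i in range(k, len(text)):
            self.counts[text[i - k:i]][text[i]] += 1

    def bits(self, text):
        """Estimate the bits needed to encode the text under this model."""
        k = self.order
        total = 0.0
        for i in range(k, len(text)):
            ctx_counts = self.counts.get(text[i - k:i])
            seen = ctx_counts[text[i]] if ctx_counts else 0
            n = sum(ctx_counts.values()) if ctx_counts else 0
            p = (seen + self.alpha) / (n + self.alpha * self.alphabet_size)
            total += -math.log2(p)
        return total


def classify(text, human_model, ai_model):
    """Return the label of the model that encodes the text in fewer bits."""
    if human_model.bits(text) <= ai_model.bits(text):
        return "human"
    return "ai"


# Toy corpora, made up purely for illustration.
human_model = FiniteContextModel()
human_model.train("the quick brown fox jumps over the lazy dog " * 20)
ai_model = FiniteContextModel()
ai_model.train("as an ai language model i cannot assist with that request " * 20)

label = classify("the quick brown fox jumps over the lazy dog", human_model, ai_model)
```

Because the decision rests on per-context symbol counts, one can inspect which contexts contributed the most bits under each model, which is one way a compression-based classifier can be easier to interpret than a neural one. Training is a single counting pass over the text and needs no GPU.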