Bibliographic Details
Main Authors: Nguyen, Trieu Hai; Akilesh, Sivaswamy
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2509.26189
_version_ 1866914068087963648
author Nguyen, Trieu Hai
Akilesh, Sivaswamy
author_facet Nguyen, Trieu Hai
Akilesh, Sivaswamy
contents The rapid development of Large Language Models (LLMs) based on transformer architectures raises key challenges, one of which is distinguishing human-written text from LLM-generated text. As LLM-generated text grows more sophisticated and more closely resembles human writing, traditional detection methods are proving less effective, especially as the number and diversity of LLMs continue to grow, with new models and versions released at a rapid pace. This study proposes VietBinoculars, an adaptation of the Binoculars method with optimized global thresholds, to improve the detection of Vietnamese LLM-generated text. We constructed new Vietnamese AI-generated datasets to determine the optimal thresholds for VietBinoculars and to enable benchmarking. Our experiments show that VietBinoculars achieves over 99% in accuracy, F1-score, and AUC across both domains on multiple out-of-domain datasets. It outperforms the original Binoculars model, traditional detection methods, and other state-of-the-art approaches, including commercial tools such as ZeroGPT and DetectGPT, especially under specially modified prompting strategies.
format Preprint
id arxiv_https___arxiv_org_abs_2509_26189
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text
Nguyen, Trieu Hai
Akilesh, Sivaswamy
Computation and Language
title VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text
topic Computation and Language
url https://arxiv.org/abs/2509.26189