Staff View: Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection :: Library Catalog

Bibliographic Details
Main Authors:	Jung, Minseok; Panizo, Cynthia Fuertes; Dugan, Liam; Fung, Yi R.; Chen, Pin-Yu; Liang, Paul Pu
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2502.04528

_version_	1866911415059611648
author	Jung, Minseok; Panizo, Cynthia Fuertes; Dugan, Liam; Fung, Yi R.; Chen, Pin-Yu; Liang, Paul Pu
author_facet	Jung, Minseok; Panizo, Cynthia Fuertes; Dugan, Liam; Fung, Yi R.; Chen, Pin-Yu; Liang, Paul Pu
contents	The advancement of large language models (LLMs) has made it difficult to differentiate human-written text from AI-generated text. Several AI-text detectors have been developed in response, which typically utilize a fixed global threshold (e.g., $\theta = 0.5$) to classify machine-generated text. However, one universal threshold could fail to account for distributional variations by subgroups. For example, when using a fixed threshold, detectors make more false positive errors on shorter human-written text, and more positive classifications of neurotic writing styles among long texts. These discrepancies can lead to misclassifications that disproportionately affect certain groups. We address this critical limitation by introducing FairOPT, an algorithm for group-specific threshold optimization for probabilistic AI-text detectors. We partitioned data into subgroups based on attributes (e.g., text length and writing style) and implemented FairOPT to learn decision thresholds for each group to reduce discrepancy. FairOPT showed notable discrepancy mitigation across nine detectors and three heterogeneous datasets, and the remarkable mitigation of the minimax problem by decreasing overall discrepancy 27.4% across five metrics while minimally sacrificing accuracy by 0.005%. Our framework paves the way for more robust classification in AI-generated content detection via post-processing. We release our data, code, and project information at URL.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_04528
institution	arXiv
publishDate	2025
record_format	arxiv
title	Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2502.04528

Similar Items