Bibliographic Details
Main Authors: Chataigner, Cléa, Ma, Rebecca, Ganesh, Prakhar, Chen, Yuhao, Taïk, Afaf, Creager, Elliot, Farnadi, Golnoosh
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2505.03563
author Chataigner, Cléa
Ma, Rebecca
Ganesh, Prakhar
Chen, Yuhao
Taïk, Afaf
Creager, Elliot
Farnadi, Golnoosh
contents Large language models (LLMs) are highly sensitive to subtle changes in prompt phrasing, posing challenges for reliable auditing. Prior methods often apply unconstrained prompt paraphrasing, which risks missing linguistic and demographic factors that shape authentic user interactions. We introduce AUGMENT (Automated User-Grounded Modeling and Evaluation of Natural Language Transformations), a framework for generating controlled paraphrases grounded in user behaviors. AUGMENT leverages linguistically informed rules and enforces quality through checks on instruction adherence, semantic similarity, and realism, ensuring paraphrases are both reliable and meaningful for auditing. Through case studies on the BBQ and MMLU datasets, we show that controlled paraphrases uncover systematic weaknesses that remain obscured under unconstrained variation. These results highlight the value of the AUGMENT framework for reliable auditing.
format Preprint
id arxiv_https___arxiv_org_abs_2505_03563
institution arXiv
publishDate 2025
record_format arxiv
title Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
topic Computation and Language
url https://arxiv.org/abs/2505.03563