Bibliographic Details
Main Authors:	Pang, Qi; Hu, Shengyuan; Zheng, Wenting; Smith, Virginia
Format:	Preprint
Published:	2024
Subjects:	Cryptography and Security; Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2402.16187

Table of Contents:

Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in the output of a model to verify its source, is useful for mitigating the misuse of such AI-generated content. However, we show that common design choices in LLM watermarking schemes make the resulting systems surprisingly susceptible to attack -- leading to fundamental trade-offs in robustness, utility, and usability. To navigate these trade-offs, we rigorously study a set of simple yet effective attacks on common watermarking systems, and propose guidelines and defenses for LLM watermarking in practice.
