| Main Authors: | Pang, Qi; Hu, Shengyuan; Zheng, Wenting; Smith, Virginia |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Cryptography and Security; Computation and Language; Machine Learning |
| Online Access: | https://arxiv.org/abs/2402.16187 |
| _version_ | 1866912117279424512 |
|---|---|
| author | Pang, Qi; Hu, Shengyuan; Zheng, Wenting; Smith, Virginia |
| contents | Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in the output of a model to verify its source, is useful for mitigating the misuse of such AI-generated content. However, we show that common design choices in LLM watermarking schemes make the resulting systems surprisingly susceptible to attack -- leading to fundamental trade-offs in robustness, utility, and usability. To navigate these trade-offs, we rigorously study a set of simple yet effective attacks on common watermarking systems, and propose guidelines and defenses for LLM watermarking in practice. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2402_16187 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices |
| topic | Cryptography and Security; Computation and Language; Machine Learning |
| url | https://arxiv.org/abs/2402.16187 |