| Main Authors: | Zhou, Tong, Zhao, Xuandong, Xu, Xiaolin, Ren, Shaolei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.01946 |
Similar Items
DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack
by: Li, Hao, et al.
Published: (2025)
ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction
by: Zhou, Tong, et al.
Published: (2025)
Model Provenance Testing for Large Language Models
by: Nikolic, Ivica, et al.
Published: (2025)
Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning
by: An, Li, et al.
Published: (2025)
Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs
by: Liu, Yepeng, et al.
Published: (2025)
Prompt Stealing Attacks Against Large Language Models
by: Sha, Zeyang, et al.
Published: (2024)
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
by: Cai, Will, et al.
Published: (2025)
Composite Backdoor Attacks Against Large Language Models
by: Huang, Hai, et al.
Published: (2023)
Data Defenses Against Large Language Models
by: Agnew, William, et al.
Published: (2024)
Eguard: Defending LLM Embeddings Against Inversion Attacks via Text Mutual Information Optimization
by: Liu, Tiantian, et al.
Published: (2024)
garak: A Framework for Security Probing Large Language Models
by: Derczynski, Leon, et al.
Published: (2024)
Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption
by: Liu, Yepeng, et al.
Published: (2025)
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
by: Zhao, Xuandong, et al.
Published: (2024)
Waterfall: Framework for Robust and Scalable Text Watermarking and Provenance for LLMs
by: Lau, Gregory Kang Ruey, et al.
Published: (2024)
Text Embedding Inversion Security for Multilingual Language Models
by: Chen, Yiyi, et al.
Published: (2024)
Building Resilient SMEs: Harnessing Large Language Models for Cyber Security in Australia
by: Kereopa-Yorke, Benjamin
Published: (2023)
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation
by: Xiong, Alexander, et al.
Published: (2025)
$PD^3F$: A Pluggable and Dynamic DoS-Defense Framework Against Resource Consumption Attacks Targeting Large Language Models
by: Zhang, Yuanhe, et al.
Published: (2025)
TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection
by: Sander, Tom, et al.
Published: (2026)
An Investigation into Misuse of Java Security APIs by Large Language Models
by: Mousavi, Zahra, et al.
Published: (2024)
Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks
by: Zhao, Jiawei, et al.
Published: (2024)
MPMA: Preference Manipulation Attack Against Model Context Protocol
by: Wang, Zihan, et al.
Published: (2025)
Securing Genomic Data Against Inference Attacks in Federated Learning Environments
by: Pathade, Chetan, et al.
Published: (2025)
GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models
by: Wang, Zilong, et al.
Published: (2025)
MetaSeal: Defending Against Image Attribution Forgery Through Content-Dependent Cryptographic Watermarks
by: Zhou, Tong, et al.
Published: (2025)
ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models
by: Wang, Zihan, et al.
Published: (2025)
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
by: Wang, Yidan, et al.
Published: (2025)
StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Organization Structures
by: Li, Bangxin, et al.
Published: (2024)
Security and Privacy Challenges of Large Language Models: A Survey
by: Das, Badhan Chandra, et al.
Published: (2024)
Privacy-Preserving Instructions for Aligning Large Language Models
by: Yu, Da, et al.
Published: (2024)
Majority Bit-Aware Watermarking For Large Language Models
by: Xu, Jiahao, et al.
Published: (2025)
SOS! Soft Prompt Attack Against Open-Source Large Language Models
by: Yang, Ziqing, et al.
Published: (2024)
Securing Multi-turn Conversational Language Models From Distributed Backdoor Triggers
by: Tong, Terry, et al.
Published: (2024)
Enhance Robustness of Language Models Against Variation Attack through Graph Integration
by: Xiong, Zi, et al.
Published: (2024)
SecureLLM: Using Compositionality to Build Provably Secure Language Models for Private, Sensitive, and Secret Data
by: Alabdulkareem, Abdulrahman, et al.
Published: (2024)
BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models
by: Wang, Zihan, et al.
Published: (2025)
CTRAP: Embedding Collapse Trap to Safeguard Large Language Models from Harmful Fine-Tuning
by: Yi, Biao, et al.
Published: (2025)
JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models
by: Feng, Yingchaojie, et al.
Published: (2024)
SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning
by: Zhou, Kaiwen, et al.
Published: (2025)
Institutional Platform for Secure Self-Service Large Language Model Exploration
by: Bumgardner, V. K. Cody, et al.
Published: (2024)