| Main Authors: | Eddoubi, Hicham; Abdullahi, Umar Faruk; Hassan, Fadi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.03265 |
Similar Items
The Resurgence of GCG Adversarial Attacks on Large Language Models
by: Tan, Yuting, et al.
Published: (2025)
Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
by: Mu, Junjie, et al.
Published: (2025)
GCG Attack On A Diffusion LLM
by: Neyroud, Ruben, et al.
Published: (2025)
Advancing Adversarial Suffix Transfer Learning on Aligned Large Language Models
by: Liu, Hongfu, et al.
Published: (2024)
Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models
by: Li, Xiao, et al.
Published: (2024)
RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors
by: Eddoubi, Hicham, et al.
Published: (2025)
Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
by: Tang, Haochun, et al.
Published: (2026)
Adversarial Suffix Filtering: a Defense Pipeline for LLMs
by: Khachaturov, David, et al.
Published: (2025)
Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
by: Yang, Xiaoxue, et al.
Published: (2025)
A Triadic Suffix Tokenization Scheme for Numerical Reasoning
by: Chetverina, Olga
Published: (2026)
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
by: Zhao, Yunhan, et al.
Published: (2024)
Sampling-aware Adversarial Attacks Against Large Language Models
by: Beyer, Tim, et al.
Published: (2025)
DPad: Efficient Diffusion Language Models with Suffix Dropout
by: Chen, Xinhua, et al.
Published: (2025)
Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models
by: Wang, Yubo, et al.
Published: (2024)
A Closer Look at Adversarial Suffix Learning for Jailbreaking LLMs: Augmented Adversarial Trigger Learning
by: Wang, Zhe, et al.
Published: (2025)
Adversarial Evasion Attack Efficiency against Large Language Models
by: Vitorino, João, et al.
Published: (2024)
Token-Modification Adversarial Attacks for Natural Language Processing: A Survey
by: Roth, Tom, et al.
Published: (2021)
REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
by: Geisler, Simon, et al.
Published: (2025)
Beyond Semantic Manipulation: Token-Space Attacks on Reward Models
by: Zhang, Yuheng, et al.
Published: (2026)
Beyond Early-Token Bias: Model-Specific and Language-Specific Position Effects in Multilingual LLMs
by: Menschikov, Mikhail, et al.
Published: (2025)
Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification
by: Faruk, Tanjim Bin
Published: (2024)
Universal and Transferable Adversarial Attack on Large Language Models Using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
by: Winninger, Thomas, et al.
Published: (2025)
Adversarial Attack on Large Language Models using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)
Policy Disruption in Reinforcement Learning: Adversarial Attack with Large Language Models and Critical State Identification
by: Jiang, Junyong, et al.
Published: (2025)
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training
by: Tran, Toan, et al.
Published: (2025)
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
by: Basani, Advik Raj, et al.
Published: (2024)
Detecting Dark Patterns in User Interfaces Using Logistic Regression and Bag-of-Words Representation
by: Umar, Aliyu, et al.
Published: (2024)
Vision Transformer with Adversarial Indicator Token against Adversarial Attacks in Radio Signal Classifications
by: Zhang, Lu, et al.
Published: (2025)
Adversarial Attacks on Audio Deepfake Detection: A Benchmark and Comparative Study
by: Uddin, Kutub, et al.
Published: (2025)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
INTERPOS: Interaction Rhythm Guided Positional Morphing for Mobile App Recommender Systems
by: Maqbool, M. H., et al.
Published: (2025)
Adversarial Attacks on Large Language Models Using Regularized Relaxation
by: Chacko, Samuel Jacob, et al.
Published: (2024)
DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models
by: Sun, Ye, et al.
Published: (2026)
Consistent Valid Physically-Realizable Adversarial Attack against Crowd-flow Prediction Models
by: Ali, Hassan, et al.
Published: (2023)
Time-To-Inconsistency: A Survival Analysis of Large Language Model Robustness to Adversarial Attacks
by: Li, Yubo, et al.
Published: (2025)
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs
by: Liao, Zeyi, et al.
Published: (2024)
AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models
by: Zhang, Jiaming, et al.
Published: (2024)
Revisiting Character-level Adversarial Attacks for Language Models
by: Rocamora, Elias Abad, et al.
Published: (2024)
ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models
by: Piehl, Mitchell, et al.
Published: (2026)