:: Library Catalog

Bibliographic Details
Main Authors:	Eddoubi, Hicham; Abdullahi, Umar Faruk; Hassan, Fadi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.03265

Similar Items

The Resurgence of GCG Adversarial Attacks on Large Language Models
by: Tan, Yuting, et al.
Published: (2025)

Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
by: Mu, Junjie, et al.
Published: (2025)

GCG Attack On A Diffusion LLM
by: Neyroud, Ruben, et al.
Published: (2025)

Advancing Adversarial Suffix Transfer Learning on Aligned Large Language Models
by: Liu, Hongfu, et al.
Published: (2024)

Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models
by: Li, Xiao, et al.
Published: (2024)

RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors
by: Eddoubi, Hicham, et al.
Published: (2025)

Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
by: Tang, Haochun, et al.
Published: (2026)

Adversarial Suffix Filtering: a Defense Pipeline for LLMs
by: Khachaturov, David, et al.
Published: (2025)

Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses
by: Yang, Xiaoxue, et al.
Published: (2025)

A Triadic Suffix Tokenization Scheme for Numerical Reasoning
by: Chetverina, Olga
Published: (2026)

BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
by: Zhao, Yunhan, et al.
Published: (2024)

Sampling-aware Adversarial Attacks Against Large Language Models
by: Beyer, Tim, et al.
Published: (2025)

DPad: Efficient Diffusion Language Models with Suffix Dropout
by: Chen, Xinhua, et al.
Published: (2025)

Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models
by: Wang, Yubo, et al.
Published: (2024)

A Closer Look at Adversarial Suffix Learning for Jailbreaking LLMs: Augmented Adversarial Trigger Learning
by: Wang, Zhe, et al.
Published: (2025)

Adversarial Evasion Attack Efficiency against Large Language Models
by: Vitorino, João, et al.
Published: (2024)

Token-Modification Adversarial Attacks for Natural Language Processing: A Survey
by: Roth, Tom, et al.
Published: (2021)

REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective
by: Geisler, Simon, et al.
Published: (2025)

Beyond Semantic Manipulation: Token-Space Attacks on Reward Models
by: Zhang, Yuheng, et al.
Published: (2026)

Beyond Early-Token Bias: Model-Specific and Language-Specific Position Effects in Multilingual LLMs
by: Menschikov, Mikhail, et al.
Published: (2025)

Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification
by: Faruk, Tanjim Bin
Published: (2024)

Universal and Transferable Adversarial Attack on Large Language Models Using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)

Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
by: Winninger, Thomas, et al.
Published: (2025)

Adversarial Attack on Large Language Models using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)

Policy Disruption in Reinforcement Learning: Adversarial Attack with Large Language Models and Critical State Identification
by: Jiang, Junyong, et al.
Published: (2025)

Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training
by: Tran, Toan, et al.
Published: (2025)

GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
by: Basani, Advik Raj, et al.
Published: (2024)

Detecting Dark Patterns in User Interfaces Using Logistic Regression and Bag-of-Words Representation
by: Umar, Aliyu, et al.
Published: (2024)

Vision Transformer with Adversarial Indicator Token against Adversarial Attacks in Radio Signal Classifications
by: Zhang, Lu, et al.
Published: (2025)

Adversarial Attacks on Audio Deepfake Detection: A Benchmark and Comparative Study
by: Uddin, Kutub, et al.
Published: (2025)

Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)

INTERPOS: Interaction Rhythm Guided Positional Morphing for Mobile App Recommender Systems
by: Maqbool, M. H., et al.
Published: (2025)

Adversarial Attacks on Large Language Models Using Regularized Relaxation
by: Chacko, Samuel Jacob, et al.
Published: (2024)

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models
by: Sun, Ye, et al.
Published: (2026)

Consistent Valid Physically-Realizable Adversarial Attack against Crowd-flow Prediction Models
by: Ali, Hassan, et al.
Published: (2023)

Time-To-Inconsistency: A Survival Analysis of Large Language Model Robustness to Adversarial Attacks
by: Li, Yubo, et al.
Published: (2025)

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs
by: Liao, Zeyi, et al.
Published: (2024)

AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models
by: Zhang, Jiaming, et al.
Published: (2024)

Revisiting Character-level Adversarial Attacks for Language Models
by: Rocamora, Elias Abad, et al.
Published: (2024)

ER-MIA: Black-Box Adversarial Memory Injection Attacks on Long-Term Memory-Augmented Large Language Models
by: Piehl, Mitchell, et al.
Published: (2026)