:: Library Catalog

Bibliographic Details
Main Authors:	Roh, Jaechul; Gandhi, Varun; Anilkumar, Shivani; Garg, Arin
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Cryptography and Security
Online Access:	https://arxiv.org/abs/2506.06971

Similar Items

Multilingual and Multi-Accent Jailbreaking of Audio LLMs
by: Roh, Jaechul, et al.
Published: (2025)

Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs
by: Roh, Jaechul, et al.
Published: (2026)

R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model
by: Naseh, Ali, et al.
Published: (2025)

Minimal Prompt Perturbations Lead to Code Vulnerabilities: Prompt Fragility and Hidden-State Signals in Coding LLMs
by: Sternfeld, Alexander, et al.
Published: (2026)

Efficient and Stealthy Jailbreak Attacks via Adversarial Prompt Distillation from LLMs to SLMs
by: Li, Xiang, et al.
Published: (2025)

SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization
by: Liu, Houjun, et al.
Published: (2026)

PurpCode: Reasoning for Safer Code Generation
by: Liu, Jiawei, et al.
Published: (2025)

Fingerprinting LLMs via Prompt Injection
by: Hu, Yuepeng, et al.
Published: (2025)

FameBias: Embedding Manipulation Bias Attack in Text-to-Image Models
by: Roh, Jaechul, et al.
Published: (2024)

Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox
by: Shukla, Shivani, et al.
Published: (2025)

OverThink: Slowdown Attacks on Reasoning LLMs
by: Kumar, Abhinav, et al.
Published: (2025)

Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation
by: Mohseni, Seyedreza, et al.
Published: (2024)

Efficient Provably Secure Linguistic Steganography via Range Coding
by: Yan, Ruiyi, et al.
Published: (2026)

One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
by: Li, Linbao, et al.
Published: (2025)

The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs
by: Berezin, Sergey, et al.
Published: (2025)

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
by: Paulus, Anselm, et al.
Published: (2024)

GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
by: Xie, Yueqi, et al.
Published: (2024)

ProSec: Fortifying Code LLMs with Proactive Security Alignment
by: Xu, Xiangzhe, et al.
Published: (2024)

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
by: Qu, Yubin, et al.
Published: (2026)

MOCHA: Are Code Language Models Robust Against Multi-Turn Malicious Coding Prompts?
by: Wahed, Muntasir, et al.
Published: (2025)

Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
by: Improta, Cristina, et al.
Published: (2023)

Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs
by: Wang, Jiawen, et al.
Published: (2025)

LockForge: Automating Paper-to-Code for Logic Locking with Multi-Agent Reasoning LLMs
by: Saha, Akashdeep, et al.
Published: (2025)

Adversarial Attacks on LLM-as-a-Judge Systems: Insights from Prompt Injections
by: Maloyan, Narek, et al.
Published: (2025)

Query-Based Adversarial Prompt Generation
by: Hayase, Jonathan, et al.
Published: (2024)

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model
by: Wu, Tianyi, et al.
Published: (2026)

Deciphering the Chaos: Enhancing Jailbreak Attacks via Adversarial Prompt Translation
by: Li, Qizhang, et al.
Published: (2024)

Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts
by: Zhang, Chiyu, et al.
Published: (2025)

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
by: Chen, Yunhao, et al.
Published: (2025)

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts
by: Xin, Yuan, et al.
Published: (2026)

From Vulnerabilities to Remediation: A Systematic Literature Review of LLMs in Code Security
by: Basic, Enna, et al.
Published: (2024)

CodeCloak: A Method for Evaluating and Mitigating Code Leakage by LLM Code Assistants
by: Noah, Amit Finkman, et al.
Published: (2024)

TSCheater: Generating High-Quality Tibetan Adversarial Texts via Visual Similarity
by: Cao, Xi, et al.
Published: (2024)

Throttling Web Agents Using Reasoning Gates
by: Kumar, Abhinav, et al.
Published: (2025)

Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning
by: Du, Xiaohu, et al.
Published: (2024)

Security Attacks on LLM-based Code Completion Tools
by: Cheng, Wen, et al.
Published: (2024)

Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface
by: Labunets, Andrey, et al.
Published: (2025)

Semantic-Preserving Adversarial Attacks on LLMs: An Adaptive Greedy Binary Search Approach
by: Zhang, Chong, et al.
Published: (2025)

OSLO: One-Shot Label-Only Membership Inference Attacks
by: Peng, Yuefeng, et al.
Published: (2024)

PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
by: Zhu, Kaijie, et al.
Published: (2023)