:: Library Catalog

Bibliographic Details
Main Authors:	Kan, Chun Yan Ryan, Tran, Tommy, Yadav, Vedant, Cai, Ava, Zhu, Kevin, Li, Ruizhe, Chaudhary, Maheep
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Artificial Intelligence; Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2602.18782

Similar Items

Neighborhood Blending: A Lightweight Inference-Time Defense Against Membership Inference Attacks
by: Zafar, Osama, et al.
Published: (2026)

Weight space Detection of Backdoors in LoRA Adapters
by: Merenciano, David Puertolas, et al.
Published: (2026)

Zero-Shot Embedding Drift Detection: A Lightweight Defense Against Prompt Injections in LLMs
by: Sekar, Anirudh, et al.
Published: (2026)

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks
by: Kuo, Kevin, et al.
Published: (2026)

Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis
by: Xu, Zhenhao, et al.
Published: (2026)

Cross-Task Defense: Instruction-Tuning LLMs for Content Safety
by: Fu, Yu, et al.
Published: (2024)

Evaluating Lightweight Block Cipher Payload Encryption for Real-Time CAN Traffic
by: Setterstrom, Kevin, et al.
Published: (2026)

SALT: Steering Activations towards Leakage-free Thinking in Chain of Thought
by: Batra, Shourya, et al.
Published: (2025)

Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs
by: Yan, Dong, et al.
Published: (2026)

Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense
by: Zhou, Qi, et al.
Published: (2024)

Botnet Detection on CTU-13 Using Lightweight Machine Learning Models
by: Gurappa, Subhash, et al.
Published: (2026)

Efficient Adversarial Malware Defense via Trust-Based Raw Override and Confidence-Adaptive Bit-Depth Reduction
by: Chaudhary, Ayush, et al.
Published: (2025)

Ensemble Privacy Defense for Knowledge-Intensive LLMs against Membership Inference Attacks
by: Fu, Haowei, et al.
Published: (2025)

Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks
by: Xiong, Chen, et al.
Published: (2024)

Plato's Form: Toward Backdoor Defense-as-a-Service for LLMs with Prototype Representations
by: Chen, Chen, et al.
Published: (2026)

Dynamic Probabilistic Noise Injection for Membership Inference Defense
by: Forough, Javad, et al.
Published: (2025)

Bidirectional Intention Inference Enhances LLMs' Defense Against Multi-Turn Jailbreak Attacks
by: Tong, Haibo, et al.
Published: (2025)

Class-Conditional Neural Polarizer: A Lightweight and Effective Backdoor Defense by Purifying Poisoned Features
by: Zhu, Mingli, et al.
Published: (2025)

Deep Learning Model Inversion Attacks and Defenses: A Comprehensive Survey
by: Yang, Wencheng, et al.
Published: (2025)

Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks
by: Palit, Vedant
Published: (2025)

Activation Approximations Can Incur Safety Vulnerabilities Even in Aligned LLMs: Comprehensive Analysis and Defense
by: Zhang, Jiawen, et al.
Published: (2025)

Jailbreaking LLMs & VLMs: Mechanisms, Evaluation, and Unified Defense
by: Chen, Zejian, et al.
Published: (2026)

Proactive Hardening of LLM Defenses with HASTE
by: Chen, Henry, et al.
Published: (2026)

Membership Inference Attacks and Defenses in Federated Learning: A Survey
by: Bai, Li, et al.
Published: (2024)

LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution
by: Yang, Zhuoran, et al.
Published: (2025)

ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction
by: Wang, Che, et al.
Published: (2026)

Minimal Cascade Gradient Smoothing for Fast Transferable Preemptive Adversarial Defense
by: Wang, Hanrui, et al.
Published: (2024)

Model-Agnostic Lifelong LLM Safety via Externalized Attack-Defense Co-Evolution
by: Zhang, Xiaozhe, et al.
Published: (2026)

Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost
by: Yeh, Cheng-Han, et al.
Published: (2024)

Evaluating the Defense Potential of Machine Unlearning against Membership Inference Attacks
by: Tsiolakis, Theodoros, et al.
Published: (2025)

United We Defend: Collaborative Membership Inference Defenses in Federated Learning
by: Bai, Li, et al.
Published: (2026)

From LLMs to MLLMs to Agents: A Survey of Emerging Paradigms in Jailbreak Attacks and Defenses within LLM Ecosystem
by: Mao, Yanxu, et al.
Published: (2025)

Bypassing the Safety Training of Open-Source LLMs with Priming Attacks
by: Vega, Jason, et al.
Published: (2023)

Lite-BD: A Lightweight Black-box Backdoor Defense via Reviving Multi-Stage Image Transformations
by: Miah, Abdullah Arafat, et al.
Published: (2026)

Detection and Defense Against Prominent Attacks on Preconditioned LLM-Integrated Virtual Assistants
by: Chan, Chun Fai, et al.
Published: (2024)

Black-box Membership Inference Attacks against Fine-tuned Diffusion Models
by: Pang, Yan, et al.
Published: (2023)

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
by: Li, Zhengyi, et al.
Published: (2026)

Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
by: Ferrand, Jean-Charles Noirot, et al.
Published: (2025)

Cyberscurity Threats and Defense Mechanisms in IoT network
by: Dao, Trung, et al.
Published: (2026)

Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision
by: Mukherjee, Manisha, et al.
Published: (2026)