| Main Authors: | Thebaud, Thomas, Lan, Gaël Le, Larcher, Anthony |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.08918 |
Similar Items
Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks
by: Shen, Huanming, et al.
Published: (2025)
A Survey of Threats Against Voice Authentication and Anti-Spoofing Systems
by: Kamel, Kamel, et al.
Published: (2025)
DITTO: A Spoofing Attack Framework on Watermarked LLMs via Knowledge Distillation
by: An, Hyeseon, et al.
Published: (2025)
Experimental Validation of Sensor Fusion-based GNSS Spoofing Attack Detection Framework for Autonomous Vehicles
by: Dasgupta, Sagar, et al.
Published: (2024)
Discovering Spoofing Attempts on Language Model Watermarks
by: Gloaguen, Thibaud, et al.
Published: (2024)
GPS Spoofing Attack Detection in Autonomous Vehicles Using Adaptive DBSCAN
by: Mohammadi, Ahmad, et al.
Published: (2025)
Entropy-Synchronized Neural Hashing for Unsupervised Ransomware Detection
by: Idliman, Peter, et al.
Published: (2025)
Reimagining Safety Alignment with An Image
by: Xia, Yifan, et al.
Published: (2025)
Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems
by: Levi, Patrick
Published: (2026)
Generalizing Speaker Verification for Spoof Awareness in the Embedding Space
by: Liu, Xuechen, et al.
Published: (2024)
Biometrics in Extended Reality: A Review
by: Agarwal, Ayush, et al.
Published: (2024)
Unsupervised Threat Hunting using Continuous Bag-of-Terms-and-Time (CBoTT)
by: Kayhan, Varol, et al.
Published: (2024)
Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations
by: Acharya, Pawan, et al.
Published: (2026)
The Cognitive Firewall: Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense
by: Lan, Qianlong, et al.
Published: (2026)
Agent Safety Alignment via Reinforcement Learning
by: Sha, Zeyang, et al.
Published: (2025)
EVA: Editing for Versatile Alignment against Jailbreaks
by: Wang, Yi, et al.
Published: (2026)
UK AISI Alignment Evaluation Case-Study
by: Souly, Alexandra, et al.
Published: (2026)
LLM-Driven Feature-Level Adversarial Attacks on Android Malware Detectors
by: Lan, Tianwei, et al.
Published: (2025)
Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
by: Ferrand, Jean-Charles Noirot, et al.
Published: (2025)
Measuring Safety Alignment Effects in Autonomous Security Agents
by: David, Isaac, et al.
Published: (2026)
Sequential Behavioral Watermarking for LLM Agents
by: An, Hyeseon, et al.
Published: (2026)
VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search
by: Li, MingSheng, et al.
Published: (2025)
FreakOut-LLM: The Effect of Emotional Stimuli on Safety Alignment
by: Kuznetsov, Daniel, et al.
Published: (2026)
Identification of Malicious Posts on the Dark Web Using Supervised Machine Learning
by: Filho, Sebastião Alves de Jesus, et al.
Published: (2025)
Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
by: Shen, Yaling, et al.
Published: (2025)
Ablating Safety: Mechanisms for Removing Alignment in Language Models for Security Applications
by: David, Isaac, et al.
Published: (2026)
Matching Ranks Over Probability Yields Truly Deep Safety Alignment
by: Vega, Jason, et al.
Published: (2025)
Defensive Refusal Bias: How Safety Alignment Fails Cyber Defenders
by: Campbell, David, et al.
Published: (2026)
PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
by: Li, Nanxi, et al.
Published: (2025)
Co-Evolutionary Multi-Modal Alignment via Structured Adversarial Evolution
by: Shi, Guoxin, et al.
Published: (2026)
When Alignment Isn't Enough: Response-Path Attacks on LLM Agents
by: Luo, Mingyu, et al.
Published: (2026)
SpoofTrackBench: Interpretable AI for Spoof-Aware UAV Tracking and Benchmarking
by: Le, Van, et al.
Published: (2025)
AgentMark: Utility-Preserving Behavioral Watermarking for Agents
by: Huang, Kaibo, et al.
Published: (2026)
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
by: Yin, Qingyu, et al.
Published: (2025)
On the Impossibility of Separating Intelligence from Judgment: The Computational Intractability of Filtering for AI Alignment
by: Ball, Sarah, et al.
Published: (2025)
SafeThinker: Reasoning about Risk to Deepen Safety Beyond Shallow Alignment
by: Fang, Xianya, et al.
Published: (2026)
Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing
by: Yu, Jeongmin, et al.
Published: (2025)
An Investigation into the Performance of Non-Contrastive Self-Supervised Learning Methods for Network Intrusion Detection
by: Fard, Hamed, et al.
Published: (2025)
Safety Layers in Aligned Large Language Models: The Key to LLM Security
by: Li, Shen, et al.
Published: (2024)
Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace
by: Lan, Qianlong, et al.
Published: (2026)