| Main Authors: | Wang, Bingzheng, Gu, Xiaoyan, Xu, Hongbo, Li, Hongcheng, Yu, Zimo, Zhou, Jiang, Wang, Weiping |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.01765 |
Similar Items
Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models
by: Chen, Linzhi, et al.
Published: (2025)
Backdoor Attribution: Elucidating and Controlling Backdoor in Language Models
by: Yu, Miao, et al.
Published: (2025)
BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation
by: Yu, Haiyang, et al.
Published: (2024)
Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency
by: Pal, Soumyadeep, et al.
Published: (2024)
Lightweight and Fast Backdoor Model Detection
by: Yu, Yinbo, et al.
Published: (2026)
DeBackdoor: A Deductive Framework for Detecting Backdoor Attacks on Deep Models with Limited Data
by: Popovic, Dorde, et al.
Published: (2025)
Backdoor4Good: Benchmarking Beneficial Uses of Backdoors in LLMs
by: Li, Yige, et al.
Published: (2026)
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model
by: Lin, Weilin, et al.
Published: (2025)
AutoBackdoor: Automating Backdoor Attacks via LLM Agents
by: Li, Yige, et al.
Published: (2025)
Backdoor Samples Detection Based on Perturbation Discrepancy Consistency in Pre-trained Language Models
by: Peng, Zuquan, et al.
Published: (2025)
When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations
by: Ge, Huaizhi, et al.
Published: (2024)
Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
by: Guo, Weiyang, et al.
Published: (2026)
Semantic-level Backdoor Attack against Text-to-Image Diffusion Models
by: Chen, Tianxin, et al.
Published: (2026)
Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models
by: Li, Xi, et al.
Published: (2024)
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
by: An, Shengwei, et al.
Published: (2023)
Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
by: Jiang, Peihai, et al.
Published: (2025)
CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing
by: Jin, Haibo, et al.
Published: (2021)
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models
by: Xu, Jiashu, et al.
Published: (2023)
Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense
by: Zhou, Qi, et al.
Published: (2024)
PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
by: Li, Wei, et al.
Published: (2024)
Towards Backdoor Stealthiness in Model Parameter Space
by: Xu, Xiaoyun, et al.
Published: (2025)
BAN: Detecting Backdoors Activated by Adversarial Neuron Noise
by: Xu, Xiaoyun, et al.
Published: (2024)
Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning
by: Song, Baogang, et al.
Published: (2025)
Beyond Immediate Activation: Temporally Decoupled Backdoor Attacks on Time Series Forecasting
by: Liu, Zhixin, et al.
Published: (2026)
The Ripple Effect: On Unforeseen Complications of Backdoor Attacks
by: Zhang, Rui, et al.
Published: (2025)
BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
by: Tie, Guiyao, et al.
Published: (2026)
TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors
by: Mo, Yichuan, et al.
Published: (2024)
Concept-Guided Backdoor Attack on Vision Language Models
by: Shen, Haoyu, et al.
Published: (2025)
Fast and Lightweight Backdoor Detection via Head Random Probing
by: Yu, Yinbo, et al.
Published: (2026)
OCGEC: One-class Graph Embedding Classification for DNN Backdoor Detection
by: Jiang, Haoyu, et al.
Published: (2023)
Impart: An Imperceptible and Effective Label-Specific Backdoor Attack
by: Zhao, Jingke, et al.
Published: (2024)
Coward: Collision-based OOD Watermarking for Practical Proactive Federated Backdoor Detection
by: Li, Wenjie, et al.
Published: (2025)
Backdooring Bias in Large Language Models
by: Das, Anudeep, et al.
Published: (2026)
DropVLA: An Action-Level Backdoor Attack on Vision-Language-Action Models
by: Xu, Zonghuan, et al.
Published: (2025)
Delayed Backdoor Attacks: Exploring the Temporal Dimension as a New Attack Surface in Pre-Trained Models
by: Ding, Zikang, et al.
Published: (2026)
Backdooring Bias ($B^2$) into Stable Diffusion Models
by: Naseh, Ali, et al.
Published: (2024)
Flashy Backdoor: Real-world Environment Backdoor Attack on SNNs with DVS Cameras
by: Riaño, Roberto, et al.
Published: (2024)
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
by: Li, Yuetai, et al.
Published: (2024)
Backdooring Masked Diffusion Language Models
by: Cao, Daniel Yiming, et al.
Published: (2026)
Backdoor Graph Condensation
by: Wu, Jiahao, et al.
Published: (2024)