| Main Authors: | Yao, Hongwei, Shi, Haoran, Chen, Yidou, Jiang, Yixin, Wang, Cong, Qin, Zhan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.09593 |
Similar Items
BadReward: Clean-Label Poisoning of Reward Models in Text-to-Image RLHF
by: Duan, Kaiwen, et al.
Published: (2025)
FDINet: Protecting against DNN Model Extraction via Feature Distortion Index
by: Yao, Hongwei, et al.
Published: (2023)
FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint
by: Shao, Shuo, et al.
Published: (2025)
Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution
by: Shao, Shuo, et al.
Published: (2024)
AttackLLM: LLM-based Attack Pattern Generation for an Industrial Control System
by: Ahmed, Chuadhry Mujeeb
Published: (2025)
Confundo: Learning to Generate Robust Poison for Practical RAG Systems
by: Hu, Haoyang, et al.
Published: (2026)
MalRAG: A Retrieval-Augmented LLM Framework for Open-set Malicious Traffic Identification
by: Luo, Xiang, et al.
Published: (2025)
From Firewalls to Frontiers: AI Red-Teaming is a Domain-Specific Evolution of Cyber Red-Teaming
by: Sinha, Anusha, et al.
Published: (2025)
GraphRAG under Fire
by: Liang, Jiacheng, et al.
Published: (2025)
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
by: Shen, Xinyue, et al.
Published: (2025)
Bias Amplification in RAG: Poisoning Knowledge Retrieval to Steer LLMs
by: Wang, Linlin, et al.
Published: (2025)
RADAR: Defending RAG Dynamically against Retrieval Corruption
by: Chen, Ziyuan, et al.
Published: (2026)
Detection and Imputation based Two-Stage Denoising Diffusion Power System Measurement Recovery under Cyber-Physical Uncertainties
by: Pei, Jianhua, et al.
Published: (2023)
Auditing Differential Privacy in the Black-Box Setting
by: Shi, Kaining, et al.
Published: (2025)
Eyes-on-Me: Scalable RAG Poisoning through Transferable Attention-Steering Attractors
by: Chen, Yen-Shan, et al.
Published: (2025)
CleanBase: Detecting Malicious Documents in RAG Knowledge Databases
by: Jin, Weifei, et al.
Published: (2026)
Ward: Provable RAG Dataset Inference via LLM Watermarks
by: Jovanović, Nikola, et al.
Published: (2024)
Adaptive Probe-based Steering for Robust LLM Jailbreaking
by: Chen, Junxi, et al.
Published: (2026)
Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents
by: Zhan, Qiusi, et al.
Published: (2025)
PIDSMaker: Building and Evaluating Provenance-based Intrusion Detection Systems
by: Bilot, Tristan, et al.
Published: (2026)
IstGPT: LLM-based Anomaly Detection for Spatial-Temporal Graph in Industrial Systems
by: Zhang, Yuchen, et al.
Published: (2026)
MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers
by: Wang, Zhiqiang, et al.
Published: (2025)
On Benchmarking Code LLMs for Android Malware Analysis
by: He, Yiling, et al.
Published: (2025)
MERLOT: A Distilled LLM-based Mixture-of-Experts Framework for Scalable Encrypted Traffic Classification
by: Chen, Yuxuan, et al.
Published: (2024)
Enhancing Privacy in ControlNet and Stable Diffusion via Split Learning
by: Yao, Dixi
Published: (2024)
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
by: Zou, Wei, et al.
Published: (2024)
Adaptive Dual-Layer Web Application Firewall (ADL-WAF) Leveraging Machine Learning for Enhanced Anomaly and Threat Detection
by: Sameh, Ahmed, et al.
Published: (2025)
Privacy-preserving Decision-focused Learning for Multi-energy Systems
by: Zhou, Yangze, et al.
Published: (2025)
ACE: A Security Architecture for LLM-Integrated App Systems
by: Li, Evan, et al.
Published: (2025)
Empirical Perturbation Analysis of Linear System Solvers from a Data Poisoning Perspective
by: Liu, Yixin, et al.
Published: (2024)
How to make Medical AI Systems safer? Simulating Vulnerabilities, and Threats in Multimodal Medical RAG System
by: Zuo, Kaiwen, et al.
Published: (2025)
CTFusion: A CTF-based Benchmark for LLM Agent Evaluation
by: Lee, Dongjun, et al.
Published: (2026)
Synthetic Artifact Auditing: Tracing LLM-Generated Synthetic Data Usage in Downstream Applications
by: Wu, Yixin, et al.
Published: (2025)
G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems
by: Wang, Shilong, et al.
Published: (2025)
Voice Jailbreak Attacks Against GPT-4o
by: Shen, Xinyue, et al.
Published: (2024)
Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution
by: Wu, Yixin, et al.
Published: (2024)
State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space
by: Guo, Ji, et al.
Published: (2026)
Semantic Chameleon: Corpus-Dependent Poisoning Attacks and Defenses in RAG Systems
by: Thornton, Scott
Published: (2026)
Architecture Matters: Comparing RAG Systems under Knowledge Base Poisoning
by: Korn, Samuel
Published: (2026)
Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks
by: Wang, Shuzhan, et al.
Published: (2024)