| Main author: | Topol, Zvi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2605.12869 |
Similar items
Quantifying Loss Aversion in Cyber Adversaries via LLM Analysis
by: Hans, Soham, et al.
Published: (2025)
TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering
by: Hossain, Saad, et al.
Published: (2026)
Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks
by: You, Doohee
Published: (2026)
Safety-Oriented Routing Analysis of Mixtral MoE Under Benign and Harmful Prompts
by: Siddiky, Md Nurul Absar
Published: (2026)
Robustness Analysis of Machine Learning Models for IoT Intrusion Detection Under Data Poisoning Attacks
by: Wulnye, Fortunatus Aabangbio, et al.
Published: (2026)
Quantifying Frontier LLM Capabilities for Container Sandbox Escape
by: Marchand, Rahul, et al.
Published: (2026)
Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization
by: Li, Xurui, et al.
Published: (2025)
LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis
by: Fayyazi, Reza, et al.
Published: (2025)
Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents
by: Nawal, Aditya, et al.
Published: (2026)
LLM-based Multi-class Attack Analysis and Mitigation Framework in IoT/IIoT Networks
by: Ikbarieh, Seif, et al.
Published: (2025)
Attention Masks Help Adversarial Attacks to Bypass Safety Detectors
by: Shi, Yunfan
Published: (2024)
LLM-Safety Evaluations Lack Robustness
by: Beyer, Tim, et al.
Published: (2025)
RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis
by: Tan, Xue, et al.
Published: (2024)
AttackPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents
by: Wu, Yixin, et al.
Published: (2025)
Quantifying the Noise of Structural Perturbations on Graph Adversarial Attacks
by: Fang, Junyuan, et al.
Published: (2025)
aiXamine: Simplified LLM Safety and Security
by: Deniz, Fatih, et al.
Published: (2025)
Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis
by: Zhang, Haoyu, et al.
Published: (2026)
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent
by: Ning, Liang-bo, et al.
Published: (2025)
Execution Is the New Attack Surface: Survivability-Aware Agentic Crypto Trading with OpenClaw-Style Local Executors
by: Borjigin, Ailiya, et al.
Published: (2026)
CompressionAttack: Exploiting Prompt Compression as a New Attack Surface in LLM-Powered Agents
by: Liu, Zesen, et al.
Published: (2025)
LoopTrap: Termination Poisoning Attacks on LLM Agents
by: Xu, Huiyu, et al.
Published: (2026)
Targeted Bit-Flip Attacks on LLM-Based Agents
by: Wang, Jialai, et al.
Published: (2026)
Exploring Backdoor Attack and Defense for LLM-empowered Recommendations
by: Ning, Liangbo, et al.
Published: (2025)
SAGE: A Generic Framework for LLM Safety Evaluation
by: Jindal, Madhur, et al.
Published: (2025)
AISA: Awakening Intrinsic Safety Awareness in Large Language Models against Jailbreak Attacks
by: Song, Weiming, et al.
Published: (2026)
SurvAttack: Black-Box Attack On Survival Models through Ontology-Informed EHR Perturbation
by: Kerdabadi, Mohsen Nayebi, et al.
Published: (2024)
Cloud-based XAI Services for Assessing Open Repository Models Under Adversarial Attacks
by: Wang, Zerui, et al.
Published: (2024)
AutoBackdoor: Automating Backdoor Attacks via LLM Agents
by: Li, Yige, et al.
Published: (2025)
Multi-Stage Prompt Inference Attacks on Enterprise LLM Systems
by: Balashov, Andrii, et al.
Published: (2025)
ToolTweak: An Attack on Tool Selection in LLM-based Agents
by: Sneh, Jonathan, et al.
Published: (2025)
Human-Imperceptible Retrieval Poisoning Attacks in LLM-Powered Applications
by: Zhang, Quan, et al.
Published: (2024)
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
by: Shi, Jiawen, et al.
Published: (2024)
Distillability of LLM Security Logic: Predicting Attack Success Rate of Outline Filling Attack via Ranking Regression
by: Zhang, Tianyu, et al.
Published: (2025)
FreakOut-LLM: The Effect of Emotional Stimuli on Safety Alignment
by: Kuznetsov, Daniel, et al.
Published: (2026)
A Method for Enhancing the Safety of Large Model Generation Based on Multi-dimensional Attack and Defense
by: Zhai, Keke
Published: (2024)
UNSEEN: A Cross-Stack LLM Unlearning Defense against AR-LLM Social Engineering Attacks
by: Yu, Tianlong, et al.
Published: (2026)
When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks
by: Chen, Shenyang, et al.
Published: (2026)
Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry
by: Saha, Shoumik, et al.
Published: (2026)
Modeling the Attack: Detecting AI-Generated Text by Quantifying Adversarial Perturbations
by: Teja, Lekkala Sai, et al.
Published: (2025)
From Incomplete Architecture to Quantified Risk: Multimodal LLM-Driven Security Assessment for Cyber-Physical Systems
by: Huang, Shaofei, et al.
Published: (2026)