| Main Authors: | Kuznetsov, Daniel, Cohen, Ofir, Shistik, Karin, Puzis, Rami, Shabtai, Asaf |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.04992 |
Similar Items
LISAA: A Framework for Large Language Model Information Security Awareness Assessment
by: Cohen, Ofir, et al.
Published: (2024)
ConGISATA: A Framework for Continuous Gamified Information Security Awareness Training and Assessment
by: Cohen, Ofir, et al.
Published: (2026)
FAA Framework: A Large Language Model-Based Approach for Credit Card Fraud Investigations
by: Shuster, Shaun, et al.
Published: (2025)
ATAG: AI-Agent Application Threat Assessment with Attack Graphs
by: Gandhi, Parth Atulbhai, et al.
Published: (2025)
VAULT: Vigilant Adversarial Updates via LLM-Driven Retrieval-Augmented Generation for NLI
by: Kazoom, Roie, et al.
Published: (2025)
MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data
by: German, Eyal, et al.
Published: (2025)
Mind the Web: The Security of Web Use Agents
by: Shapira, Avishag, et al.
Published: (2025)
DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation
by: Segal, Tom, et al.
Published: (2024)
From Tool Orchestration to Code Execution: A Study of MCP Design Choices
by: Felendler, Yuval, et al.
Published: (2026)
SCyTAG: Scalable Cyber-Twin for Threat-Assessment Based on Attack Graphs
by: Tayouri, David, et al.
Published: (2025)
GPT in Sheep's Clothing: The Risk of Customized GPTs
by: Antebi, Sagiv, et al.
Published: (2024)
SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization
by: Meidan, Yair, et al.
Published: (2026)
AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior
by: Abaev, Nadya, et al.
Published: (2026)
Measuring Safety Alignment Effects in Autonomous Security Agents
by: David, Isaac, et al.
Published: (2026)
LumiMAS: A Comprehensive Framework for Real-Time Monitoring and Enhanced Observability in Multi-Agent Systems
by: Solomon, Ron, et al.
Published: (2025)
Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
by: Ferrand, Jean-Charles Noirot, et al.
Published: (2025)
Reimagining Safety Alignment with An Image
by: Xia, Yifan, et al.
Published: (2025)
Uncovering Logit Suppression Vulnerabilities in LLM Safety Alignment
by: Li, Yuxi, et al.
Published: (2024)
Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment
by: Zaree, Pedram, et al.
Published: (2025)
Agent Safety Alignment via Reinforcement Learning
by: Sha, Zeyang, et al.
Published: (2025)
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
by: Cohen, Avihay
Published: (2025)
Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization
by: Li, Xurui, et al.
Published: (2025)
VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search
by: Li, MingSheng, et al.
Published: (2025)
LED there be DoS: Exploiting variable bitrate IP cameras for network DoS
by: Goldberg, Emmanuel, et al.
Published: (2025)
LLM-Safety Evaluations Lack Robustness
by: Beyer, Tim, et al.
Published: (2025)
Deep Learning Based XIoT Malware Analysis: A Comprehensive Survey, Taxonomy, and Research Challenges
by: Darwish, Rami, et al.
Published: (2024)
Ablating Safety: Mechanisms for Removing Alignment in Language Models for Security Applications
by: David, Isaac, et al.
Published: (2026)
Defensive Refusal Bias: How Safety Alignment Fails Cyber Defenders
by: Campbell, David, et al.
Published: (2026)
Matching Ranks Over Probability Yields Truly Deep Safety Alignment
by: Vega, Jason, et al.
Published: (2025)
PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
by: Li, Nanxi, et al.
Published: (2025)
Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents
by: Nawal, Aditya, et al.
Published: (2026)
aiXamine: Simplified LLM Safety and Security
by: Deniz, Fatih, et al.
Published: (2025)
What Matters For Safety Alignment?
by: Li, Xing, et al.
Published: (2026)
SafeThinker: Reasoning about Risk to Deepen Safety Beyond Shallow Alignment
by: Fang, Xianya, et al.
Published: (2026)
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
by: Yin, Qingyu, et al.
Published: (2025)
Mitigating the Safety-utility Trade-off in LLM Alignment via Adaptive Safe Context Learning
by: Wang, Yanbo, et al.
Published: (2026)
SAGE: A Generic Framework for LLM Safety Evaluation
by: Jindal, Madhur, et al.
Published: (2025)
Safety Alignment Should Be Made More Than Just a Few Tokens Deep
by: Qi, Xiangyu, et al.
Published: (2024)
MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming
by: Guo, Weiyang, et al.
Published: (2025)
FedP3E: Privacy-Preserving Prototype Exchange for Non-IID IoT Malware Detection in Cross-Silo Federated Learning
by: Darwish, Rami, et al.
Published: (2025)