:: Library Catalog

Bibliographic Details
Main Authors:	Kuznetsov, Daniel; Cohen, Ofir; Shistik, Karin; Puzis, Rami; Shabtai, Asaf
Format:	Preprint
Published:	2026
Subjects:	Cryptography and Security; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.04992

Similar Items

LISAA: A Framework for Large Language Model Information Security Awareness Assessment
by: Cohen, Ofir, et al.
Published: (2024)

ConGISATA: A Framework for Continuous Gamified Information Security Awareness Training and Assessment
by: Cohen, Ofir, et al.
Published: (2026)

FAA Framework: A Large Language Model-Based Approach for Credit Card Fraud Investigations
by: Shuster, Shaun, et al.
Published: (2025)

ATAG: AI-Agent Application Threat Assessment with Attack Graphs
by: Gandhi, Parth Atulbhai, et al.
Published: (2025)

VAULT: Vigilant Adversarial Updates via LLM-Driven Retrieval-Augmented Generation for NLI
by: Kazoom, Roie, et al.
Published: (2025)

MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data
by: German, Eyal, et al.
Published: (2025)

Mind the Web: The Security of Web Use Agents
by: Shapira, Avishag, et al.
Published: (2025)

DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation
by: Segal, Tom, et al.
Published: (2024)

From Tool Orchestration to Code Execution: A Study of MCP Design Choices
by: Felendler, Yuval, et al.
Published: (2026)

SCyTAG: Scalable Cyber-Twin for Threat-Assessment Based on Attack Graphs
by: Tayouri, David, et al.
Published: (2025)

GPT in Sheep's Clothing: The Risk of Customized GPTs
by: Antebi, Sagiv, et al.
Published: (2024)

SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization
by: Meidan, Yair, et al.
Published: (2026)

AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior
by: Abaev, Nadya, et al.
Published: (2026)

Measuring Safety Alignment Effects in Autonomous Security Agents
by: David, Isaac, et al.
Published: (2026)

LumiMAS: A Comprehensive Framework for Real-Time Monitoring and Enhanced Observability in Multi-Agent Systems
by: Solomon, Ron, et al.
Published: (2025)

Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
by: Ferrand, Jean-Charles Noirot, et al.
Published: (2025)

Reimagining Safety Alignment with An Image
by: Xia, Yifan, et al.
Published: (2025)

Uncovering Logit Suppression Vulnerabilities in LLM Safety Alignment
by: Li, Yuxi, et al.
Published: (2024)

Attention Eclipse: Manipulating Attention to Bypass LLM Safety-Alignment
by: Zaree, Pedram, et al.
Published: (2025)

Agent Safety Alignment via Reinforcement Learning
by: Sha, Zeyang, et al.
Published: (2025)

In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
by: Cohen, Avihay
Published: (2025)

Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization
by: Li, Xurui, et al.
Published: (2025)

VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search
by: Li, MingSheng, et al.
Published: (2025)

LED there be DoS: Exploiting variable bitrate IP cameras for network DoS
by: Goldberg, Emmanuel, et al.
Published: (2025)

LLM-Safety Evaluations Lack Robustness
by: Beyer, Tim, et al.
Published: (2025)

Deep Learning Based XIoT Malware Analysis: A Comprehensive Survey, Taxonomy, and Research Challenges
by: Darwish, Rami, et al.
Published: (2024)

Ablating Safety: Mechanisms for Removing Alignment in Language Models for Security Applications
by: David, Isaac, et al.
Published: (2026)

Defensive Refusal Bias: How Safety Alignment Fails Cyber Defenders
by: Campbell, David, et al.
Published: (2026)

Matching Ranks Over Probability Yields Truly Deep Safety Alignment
by: Vega, Jason, et al.
Published: (2025)

PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
by: Li, Nanxi, et al.
Published: (2025)

Relevance as a Vulnerability: How Web Retrieval Degrades Safety Alignment in LLM Agents
by: Nawal, Aditya, et al.
Published: (2026)

aiXamine: Simplified LLM Safety and Security
by: Deniz, Fatih, et al.
Published: (2025)

What Matters For Safety Alignment?
by: Li, Xing, et al.
Published: (2026)

SafeThinker: Reasoning about Risk to Deepen Safety Beyond Shallow Alignment
by: Fang, Xianya, et al.
Published: (2026)

Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
by: Yin, Qingyu, et al.
Published: (2025)

Mitigating the Safety-utility Trade-off in LLM Alignment via Adaptive Safe Context Learning
by: Wang, Yanbo, et al.
Published: (2026)

SAGE: A Generic Framework for LLM Safety Evaluation
by: Jindal, Madhur, et al.
Published: (2025)

Safety Alignment Should Be Made More Than Just a Few Tokens Deep
by: Qi, Xiangyu, et al.
Published: (2024)

MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming
by: Guo, Weiyang, et al.
Published: (2025)

FedP3E: Privacy-Preserving Prototype Exchange for Non-IID IoT Malware Detection in Cross-Silo Federated Learning
by: Darwish, Rami, et al.
Published: (2025)