| Main Author: | de Gregorio, Alfonso |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.17109 |
Similar Items
Mitigating Jailbreaks with Intent-Aware LLMs
by: Yeo, Wei Jie, et al.
Published: (2025)
SoK: Privacy Risks and Mitigations in Retrieval-Augmented Generation Systems
by: Bodea, Andreea-Elena, et al.
Published: (2026)
A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy
by: Wang, Huandong, et al.
Published: (2025)
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
by: Mathew, Yohan, et al.
Published: (2024)
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation
by: Xiong, Alexander, et al.
Published: (2025)
The Ethics of Interaction: Mitigating Security Threats in LLMs
by: Kumar, Ashutosh, et al.
Published: (2024)
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs
by: Wang, Jiawen, et al.
Published: (2025)
Preventing Jailbreak Prompts as Malicious Tools for Cybercriminals: A Cyber Defense Perspective
by: Tshimula, Jean Marie, et al.
Published: (2024)
Building Resilient SMEs: Harnessing Large Language Models for Cyber Security in Australia
by: Kereopa-Yorke, Benjamin
Published: (2023)
The Incoherency Risk in the EU's New Cyber Security Policies
by: Ruohonen, Jukka
Published: (2024)
LLM Cyber Evaluations Don't Capture Real-World Risk
by: Lukošiūtė, Kamilė, et al.
Published: (2025)
From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions
by: Peng, Jiaren, et al.
Published: (2025)
CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research?
by: Chen, Xiangsen, et al.
Published: (2026)
Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs
by: Joshi, Kunj, et al.
Published: (2025)
Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference
by: Luo, Zhifan, et al.
Published: (2025)
PolicyLR: A Logic Representation For Privacy Policies
by: Hooda, Ashish, et al.
Published: (2024)
You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors
by: Cao, Bochuan, et al.
Published: (2025)
LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification
by: Tang, Fengxiao, et al.
Published: (2025)
Cyber-Zero: Training Cybersecurity Agents without Runtime
by: Zhuo, Terry Yue, et al.
Published: (2025)
The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration
by: Patil, Vaidehi, et al.
Published: (2025)
Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness
by: Shafee, Samaneh, et al.
Published: (2024)
GuidedBench: Measuring and Mitigating the Evaluation Discrepancies of In-the-wild LLM Jailbreak Methods
by: Huang, Ruixuan, et al.
Published: (2025)
MBTSAD: Mitigating Backdoors in Language Models Based on Token Splitting and Attention Distillation
by: Ding, Yidong, et al.
Published: (2025)
Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation
by: Zhang, Junbo, et al.
Published: (2025)
Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
by: Wang, Jiongxiao, et al.
Published: (2024)
Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation
by: Kwartler, Ted, et al.
Published: (2024)
Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors
by: Yin, Rui, et al.
Published: (2026)
Fingerprinting LLMs via Prompt Injection
by: Hu, Yuepeng, et al.
Published: (2025)
LLMs for Domain Generation Algorithm Detection
by: La O, Reynier Leyva, et al.
Published: (2024)
Overriding Safety protections of Open-source Models
by: Kumar, Sachin
Published: (2024)
SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks
by: He, Xuanli, et al.
Published: (2024)
WaterPool: A Watermark Mitigating Trade-offs among Imperceptibility, Efficacy and Robustness
by: Huang, Baizhou, et al.
Published: (2024)
Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments
by: Rigaki, Maria, et al.
Published: (2023)
Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks
by: Struppek, Lukas, et al.
Published: (2026)
SoK: Are Watermarks in LLMs Ready for Deployment?
by: Dang, Kieu, et al.
Published: (2025)
PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing
by: Hughes, Anthony, et al.
Published: (2025)
Do Prompts Guarantee Safety? Mitigating Toxicity from LLM Generations through Subspace Intervention
by: Singh, Himanshu, et al.
Published: (2026)
Towards Principled Analysis and Mitigation of Space Cyber Risks
by: Ear, Ekzhin
Published: (2025)
ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
by: Wu, Yiran, et al.
Published: (2025)
Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models
by: Fieblinger, Romy, et al.
Published: (2024)