| Main Author: | Malmqvist, Lars |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.07846 |
Similar Items
Win-k: Improved Membership Inference Attacks on Small Language Models
by: Arkhmammadova, Roya, et al.
Published: (2025)
Eliciting and Analyzing Emergent Misalignment in State-of-the-Art Large Language Models
by: Panpatil, Siddhant, et al.
Published: (2025)
Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs
by: Goldenits, Georg, et al.
Published: (2025)
A Behavioral Fingerprint for Large Language Models: Provenance Tracking via Refusal Vectors
by: Xu, Zhenyu, et al.
Published: (2026)
Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments
by: Rigaki, Maria, et al.
Published: (2023)
Distilled Large Language Model in Confidential Computing Environment for System-on-Chip Design
by: Ben, Dong, et al.
Published: (2025)
BitHydra: Towards Bit-flip Inference Cost Attack against Large Language Models
by: Yan, Xiaobei, et al.
Published: (2025)
SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents
by: Qu, Yubin, et al.
Published: (2026)
Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language Models
by: Caballero, Mario Marín, et al.
Published: (2026)
User Behavior Analysis in Privacy Protection with Large Language Models: A Study on Privacy Preferences with Limited Data
by: Yang, Haowei, et al.
Published: (2025)
Eliciting Least-to-Most Reasoning for Phishing URL Detection
by: Trikilis, Holly, et al.
Published: (2026)
Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models
by: Raza, Ali, et al.
Published: (2026)
Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
by: Zhang, Wenhui, et al.
Published: (2025)
Towards Small Language Models for Security Query Generation in SOC Workflows
by: Muzammil, Saleha, et al.
Published: (2025)
A Survey of Attacks on Large Language Models
by: Xu, Wenrui, et al.
Published: (2025)
A Survey of Large Language Models in Cybersecurity
by: da Silva, Gabriel de Jesus Coelho, et al.
Published: (2024)
Safeguarding Large Language Models: A Survey
by: Dong, Yi, et al.
Published: (2024)
When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents
by: Jones, Jaylen, et al.
Published: (2026)
A Cross-Language Investigation into Jailbreak Attacks in Large Language Models
by: Li, Jie, et al.
Published: (2024)
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
by: Nakka, Kalyan, et al.
Published: (2024)
Security Concerns for Large Language Models: A Survey
by: Li, Miles Q., et al.
Published: (2025)
A Survey on Data Security in Large Language Models
by: Chen, Kang, et al.
Published: (2025)
Watermarking Techniques for Large Language Models: A Survey
by: Liang, Yuqing, et al.
Published: (2024)
(Security) Assertions by Large Language Models
by: Kande, Rahul, et al.
Published: (2023)
LLMmap: Fingerprinting For Large Language Models
by: Pasquini, Dario, et al.
Published: (2024)
Backdooring Bias in Large Language Models
by: Das, Anudeep, et al.
Published: (2026)
Towards Privacy-Preserving and Personalized Smart Homes via Tailored Small Language Models
by: Huang, Xinyu, et al.
Published: (2025)
Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis
by: Akhtar, Siraaj, et al.
Published: (2026)
BEACON: Behavioral Malware Classification with Large Language Model Embeddings and Deep Learning
by: Perera, Wadduwage Shanika, et al.
Published: (2025)
BEACON: A Unified Behavioral-Tactical Framework for Explainable Cybercrime Analysis with Large Language Models
by: Sachdeva, Arush, et al.
Published: (2025)
Digital Forensics in the Age of Large Language Models
by: Yin, Zhipeng, et al.
Published: (2025)
Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing
by: Guo, Bingcan, et al.
Published: (2025)
Emerging Security Challenges of Large Language Models
by: Debar, Herve, et al.
Published: (2024)
Emoji-Based Jailbreaking of Large Language Models
by: Gopinadh, M P V S, et al.
Published: (2026)
Functional Subspace Watermarking for Large Language Models
by: Ding, Zikang, et al.
Published: (2026)
GUARD-SLM: Token Activation-Based Defense Against Jailbreak Attacks for Small Language Models
by: Mia, Md Jueal, et al.
Published: (2026)
GhostCite: A Large-Scale Analysis of Citation Validity in the Age of Large Language Models
by: Xu, Zuyao, et al.
Published: (2026)
Large Language Models for Security Operations Centers: A Comprehensive Survey
by: Habibzadeh, Ali, et al.
Published: (2025)
CEFW: A Comprehensive Evaluation Framework for Watermark in Large Language Models
by: Zhang, Shuhao, et al.
Published: (2025)
A Survey: Towards Privacy and Security in Mobile Large Language Models
by: Xu, Honghui, et al.
Published: (2025)