| Main Authors: | Salerno, Fabio, Al-Kaswan, Ali, Izadi, Maliheh |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.17501 |
Similar Items
Traces of Memorisation in Large Language Models for Code
by: Al-Kaswan, Ali, et al.
Published: (2023)
Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture the Flag Challenges
by: Al-Kaswan, Ali, et al.
Published: (2026)
Fine-tuning is Not Fine: Mitigating Backdoor Attacks in GNNs with Limited Clean Data
by: Zhang, Jiale, et al.
Published: (2025)
Black-box Membership Inference Attacks against Fine-tuned Diffusion Models
by: Pang, Yan, et al.
Published: (2023)
SoK: Reducing the Vulnerability of Fine-tuned Language Models to Membership Inference Attacks
by: Amit, Guy, et al.
Published: (2024)
On the Reliability of Biometric Datasets: How Much Test Data Ensures Reliability?
by: Fallahi, Matin, et al.
Published: (2025)
On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models
by: Sahili, Ali Al, et al.
Published: (2025)
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
by: He, Yu, et al.
Published: (2025)
Do Multimodal RAG Systems Leak Data? A Comprehensive Evaluation of Membership Inference and Image Caption Retrieval Attacks
by: Al-Lawati, Ali, et al.
Published: (2026)
How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation under the One-Time-Pad-Based Framework
by: Liang, Zi, et al.
Published: (2025)
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
by: Huang, Tiansheng, et al.
Published: (2024)
StrTune: Data Dependence-based Code Slicing for Binary Similarity Detection with Fine-tuned Representation
by: He, Kaiyan, et al.
Published: (2024)
Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
by: Xie, Zhixin, et al.
Published: (2025)
Fine-tuning Large Language Models for DGA and DNS Exfiltration Detection
by: Sayed, Md Abu, et al.
Published: (2024)
System Prompt Extraction Attacks and Defenses in Large Language Models
by: Das, Badhan Chandra, et al.
Published: (2025)
Window-based Membership Inference Attacks Against Fine-tuned Large Language Models
by: Chen, Yuetian, et al.
Published: (2026)
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code
by: Bappy, Md. Azizul Hakim, et al.
Published: (2025)
Detecting Instruction Fine-tuning Attacks using Influence Function
by: Li, Jiawei
Published: (2025)
SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks
by: Zhang, Kaiyuan, et al.
Published: (2025)
Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data
by: Akkus, Atilla, et al.
Published: (2024)
No Two Devils Alike: Unveiling Distinct Mechanisms of Fine-tuning Attacks
by: Leong, Chak Tou, et al.
Published: (2024)
Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues
by: Cipollone, Daniele, et al.
Published: (2025)
LoRA-Leak: Membership Inference Attacks Against LoRA Fine-tuned Language Models
by: Ran, Delong, et al.
Published: (2025)
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning
by: Huang, Tiansheng, et al.
Published: (2024)
Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
by: Wang, Jiongxiao, et al.
Published: (2024)
Navigating the Designs of Privacy-Preserving Fine-tuning for Large Language Models
by: Shi, Haonan, et al.
Published: (2025)
Fine-tuning of Large Language Models for Domain-Specific Cybersecurity Knowledge
by: Huang, Yuan
Published: (2025)
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation
by: Huang, Tiansheng, et al.
Published: (2025)
Practical Secure Inference Algorithm for Fine-tuned Large Language Model Based on Fully Homomorphic Encryption
by: Ruoyan, Zhang, et al.
Published: (2025)
SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models
by: Afane, Mohamed, et al.
Published: (2025)
Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
by: Fu, Wenjie, et al.
Published: (2023)
Model Extraction Attacks Revisited
by: Liang, Jiacheng, et al.
Published: (2023)
Differentially Private Parameter-Efficient Fine-tuning for Large ASR Models
by: Liu, Hongbin, et al.
Published: (2024)
PriFFT: Privacy-preserving Federated Fine-tuning of Large Language Models via Hybrid Secret Sharing
by: You, Zhichao, et al.
Published: (2025)
ModelShield: Adaptive and Robust Watermark against Model Extraction Attack
by: Pang, Kaiyi, et al.
Published: (2024)
An Automated Attack Investigation Approach Leveraging Threat-Knowledge-Augmented Large Language Models
by: Dai, Rujie, et al.
Published: (2025)
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
by: Yang, Kang, et al.
Published: (2025)
Large Language Models for Code Analysis: Do LLMs Really Do Their Job?
by: Fang, Chongzhou, et al.
Published: (2023)
EnchTable: Unified Safety Alignment Transfer in Fine-tuned Large Language Models
by: Wu, Jialin, et al.
Published: (2025)
Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning
by: Liu, Guozhi, et al.
Published: (2025)