Bibliographic Details
Main Authors: Lee, Youngjoon, Park, Taehyun, Lee, Yunho, Gong, Jinu, Kang, Joonhyuk
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2501.18416
_version_ 1866917455199207424
author Lee, Youngjoon
Park, Taehyun
Lee, Yunho
Gong, Jinu
Kang, Joonhyuk
contents Federated Learning (FL) is increasingly being adopted in military collaborations to develop Large Language Models (LLMs) while preserving data sovereignty. However, prompt injection attacks (malicious manipulations of input prompts) pose new threats that may undermine operational security, disrupt decision-making, and erode trust among allies. This perspective paper highlights four vulnerabilities in federated military LLMs: secret data leakage, free-rider exploitation, system disruption, and misinformation spread. To address these risks, we propose a human-AI collaborative framework with both technical and policy countermeasures. On the technical side, our framework uses red/blue team wargaming and quality assurance to detect and mitigate adversarial behaviors of shared LLM weights. On the policy side, it promotes joint AI-human policy development and verification of security protocols.
format Preprint
id arxiv_https___arxiv_org_abs_2501_18416
institution arXiv
publishDate 2025
record_format arxiv
title Exploring Potential Prompt Injection Attacks in Federated Military LLMs and Their Mitigation
topic Machine Learning
url https://arxiv.org/abs/2501.18416