Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Nizon-Deladoeuille, Martin; Stefánsson, Brynjólfur; Neukirchen, Helmut; Welsh, Thomas
Format:	Preprint
Published:	2025
Subjects:	Cryptography and Security
Online Access:	https://arxiv.org/abs/2501.17539

_version_	1866911546963132416
author	Nizon-Deladoeuille, Martin; Stefánsson, Brynjólfur; Neukirchen, Helmut; Welsh, Thomas
author_facet	Nizon-Deladoeuille, Martin; Stefánsson, Brynjólfur; Neukirchen, Helmut; Welsh, Thomas
contents	Cybersecurity education is challenging, and it is helpful for educators to understand Large Language Models' (LLMs') capabilities for supporting education. This study evaluates the effectiveness of LLMs in conducting a variety of penetration testing tasks. Fifteen representative tasks were selected to cover a comprehensive range of real-world scenarios. We evaluate the performance of 6 models (GPT-4o mini, GPT-4o, Gemini 1.5 Flash, Llama 3.1 405B, Mixtral 8x7B and WhiteRabbitNeo) on the Metasploitable v3 Ubuntu image and OWASP WebGOAT. Our findings suggest that GPT-4o mini currently offers the most consistent support, making it a valuable tool for educational purposes. However, using it in conjunction with WhiteRabbitNeo should be considered because of WhiteRabbitNeo's innovative approach to tool and command recommendations. This study underscores the need for continued research into optimising LLMs for complex, domain-specific tasks in cybersecurity education.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_17539
institution	arXiv
publishDate	2025
record_format	arxiv
title	Towards Supporting Penetration Testing Education with Large Language Models: an Evaluation and Comparison
topic	Cryptography and Security
url	https://arxiv.org/abs/2501.17539
