Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Pekar, Adrian; Jozsa, Richard
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Cryptography and Security
Online Access:	https://arxiv.org/abs/2407.02856

_version_	1866909665981366272
author	Pekar, Adrian; Jozsa, Richard
author_facet	Pekar, Adrian; Jozsa, Richard
contents	This study investigates the efficacy of machine learning models in network security threat detection through the critical lens of partial versus complete flow information, addressing a common gap between research settings and real-time operational needs. We systematically evaluate how a standard benchmark model, Random Forest, performs under varying training and testing conditions (complete/complete, partial/partial, complete/partial), quantifying the performance impact when dealing with the incomplete data typical in real-time environments. Our findings demonstrate a significant performance difference, with precision and recall dropping by up to 30% under certain conditions when models trained on complete flows are tested against partial flows. The study also reveals that, for the evaluated dataset and model, a minimum threshold around 7 packets in the test set appears necessary for maintaining reliable detection rates, providing valuable, quantified insights for developing more realistic real-time detection strategies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_02856
institution	arXiv
publishDate	2024
record_format	arxiv
title	Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows
topic	Machine Learning; Cryptography and Security
url	https://arxiv.org/abs/2407.02856
