Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Cheung, Mark; Venkatesan, Sridhar
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Networking and Internet Architecture
Online Access:	https://arxiv.org/abs/2504.11255

_version_	1866910911949701120
author	Cheung, Mark; Venkatesan, Sridhar
author_facet	Cheung, Mark; Venkatesan, Sridhar
contents	The ability to reconstruct fine-grained network session data, including individual packets, from coarse-grained feature vectors is crucial for improving network security models. However, the large-scale collection and storage of raw network traffic pose significant challenges, particularly for capturing rare cyberattack samples. These challenges hinder the ability to retain comprehensive datasets for model training and future threat detection. To address this, we propose a machine learning approach guided by formal methods to encode and reconstruct network data. Our method employs autoencoder models with domain-informed penalties to impute PCAP session headers from structured feature representations. Experimental results demonstrate that incorporating domain knowledge through constraint-based loss terms significantly improves reconstruction accuracy, particularly for categorical features with session-level encodings. By enabling efficient reconstruction of detailed network sessions, our approach facilitates data-efficient model training while preserving privacy and storage efficiency.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_11255
institution	arXiv
publishDate	2025
record_format	arxiv
title	Reconstructing Fine-Grained Network Data using Autoencoder Architectures with Domain Knowledge Penalties
topic	Machine Learning; Networking and Internet Architecture
url	https://arxiv.org/abs/2504.11255