Bibliographic Details
Main Authors: Pulido, José; Wilhelmi, Francesc; Fortes, Sergio; Fernández-Durán, Alfonso; Giordano, Lorenzo Galati; Barco, Raquel
Format: Preprint
Published: 2026
Subjects: Systems and Control, Machine Learning
Online Access: https://arxiv.org/abs/2601.07646
author Pulido, José
Wilhelmi, Francesc
Fortes, Sergio
Fernández-Durán, Alfonso
Giordano, Lorenzo Galati
Barco, Raquel
contents Synthetic data generation is an appealing tool for augmenting and enriching datasets, playing a crucial role in advancing artificial intelligence (AI) and machine learning (ML). Not only does synthetic data help build robust AI/ML datasets cost-effectively, but it also offers privacy-friendly solutions and bypasses the complexities of storing large data volumes. This paper proposes a novel method to generate synthetic data, based on first-order auto-regressive noise statistics, for large-scale Wi-Fi deployments. The approach operates with minimal real data requirements while producing statistically rich traffic patterns that effectively mimic real Access Point (AP) behavior. Experimental results show that ML models trained on synthetic data achieve Mean Absolute Error (MAE) values within 10 to 15 of those obtained using real data when trained on the same APs, while requiring significantly less training data. Moreover, when generalization is required, synthetic-data-trained models improve prediction accuracy by up to 50 percent compared to real-data-trained baselines, thanks to the enhanced variability and diversity of the generated traces. Overall, the proposed method bridges the gap between synthetic data generation and practical Wi-Fi traffic forecasting, providing a scalable, efficient, and real-time solution for modern wireless networks.
format Preprint
id arxiv_https___arxiv_org_abs_2601_07646
institution arXiv
publishDate 2026
record_format arxiv
title Studying the Role of Synthetic Data for Machine Learning-based Wireless Networks Traffic Forecasting
topic Systems and Control
Machine Learning
url https://arxiv.org/abs/2601.07646