Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Santhi, Neagin Neasamoni, Villa, Davide, Polese, Michele, D'Oro, Salvatore, Lee, Yunseong, Furueda, Koichiro, Melodia, Tommaso
Format: Preprint
Veröffentlicht: 2026
Schlagworte:
Online-Zugang:https://arxiv.org/abs/2604.23397
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
_version_ 1866913062528745472
author Santhi, Neagin Neasamoni
Villa, Davide
Polese, Michele
D'Oro, Salvatore
Lee, Yunseong
Furueda, Koichiro
Melodia, Tommaso
author_facet Santhi, Neagin Neasamoni
Villa, Davide
Polese, Michele
D'Oro, Salvatore
Lee, Yunseong
Furueda, Koichiro
Melodia, Tommaso
contents Artificial Intelligence (AI) has become a powerful tool for model-free Radio Access Network (RAN) signal processing and optimization. However, designing a single model that generalizes across all radio environments is challenging. Specialized AI models outperform conventional algorithms only under specific conditions, while their higher compute and energy cost makes unconditional execution impractical at the base station. This creates a need for real-time expert switching: dynamically activating the most appropriate AI or conventional expert based on current network conditions. To address this, we propose ARCHES (Adaptive Real-time CUDA Hot-swapping of Experts in the RAN Stack), a framework hosting multiple AI-based and conventional signal processing experts within a GPU-accelerated PHY pipeline, dynamically selecting the most appropriate expert at slot-boundary granularity without dropping or corrupting in-flight data. ARCHES includes a lightweight CUDA switch kernel for zero-gap output selection, a dApp-based control plane that collects cross-layer telemetry and drives the switching policy, and a reusable process for policy design based on controlled perturbation, monotonicity filtering, and hierarchical clustering. We validate ARCHES on UL channel estimation, switching between an AI-based and a Minimum Mean Square Error (MMSE) estimator under changing propagation and interference conditions. Implemented on the X5G platform with NVIDIA Aerial and OpenAirInterface (OAI), ARCHES achieves median UL PHY throughput gains of 5.32% and 7.23% under good and poor conditions, with a control-loop latency of ~140 us and sub-microsecond decision inference. Under good conditions, defaulting to MMSE saves 15.8 W of GPU power (9.6%) and 17 percentage points of GPU utilization versus unconditional AI execution, validating the performance-per-watt tradeoff that motivates adaptive expert selection.
format Preprint
id arxiv_https___arxiv_org_abs_2604_23397
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle ARCHES: Adaptive Real-Time Switching of AI Models for the RAN
Santhi, Neagin Neasamoni
Villa, Davide
Polese, Michele
D'Oro, Salvatore
Lee, Yunseong
Furueda, Koichiro
Melodia, Tommaso
Networking and Internet Architecture
Artificial Intelligence (AI) has become a powerful tool for model-free Radio Access Network (RAN) signal processing and optimization. However, designing a single model that generalizes across all radio environments is challenging. Specialized AI models outperform conventional algorithms only under specific conditions, while their higher compute and energy cost makes unconditional execution impractical at the base station. This creates a need for real-time expert switching: dynamically activating the most appropriate AI or conventional expert based on current network conditions. To address this, we propose ARCHES (Adaptive Real-time CUDA Hot-swapping of Experts in the RAN Stack), a framework hosting multiple AI-based and conventional signal processing experts within a GPU-accelerated PHY pipeline, dynamically selecting the most appropriate expert at slot-boundary granularity without dropping or corrupting in-flight data. ARCHES includes a lightweight CUDA switch kernel for zero-gap output selection, a dApp-based control plane that collects cross-layer telemetry and drives the switching policy, and a reusable process for policy design based on controlled perturbation, monotonicity filtering, and hierarchical clustering. We validate ARCHES on UL channel estimation, switching between an AI-based and a Minimum Mean Square Error (MMSE) estimator under changing propagation and interference conditions. Implemented on the X5G platform with NVIDIA Aerial and OpenAirInterface (OAI), ARCHES achieves median UL PHY throughput gains of 5.32% and 7.23% under good and poor conditions, with a control-loop latency of ~140 us and sub-microsecond decision inference. Under good conditions, defaulting to MMSE saves 15.8 W of GPU power (9.6%) and 17 percentage points of GPU utilization versus unconditional AI execution, validating the performance-per-watt tradeoff that motivates adaptive expert selection.
title ARCHES: Adaptive Real-Time Switching of AI Models for the RAN
topic Networking and Internet Architecture
url https://arxiv.org/abs/2604.23397