Software Fault Detection in Telecom Networks using Bi-level Federated Graph Neural Networks

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: The increasing complexity of telecom networks, induced by the recent development of 5G, is a challenge for detecting faults in the telecom network. In addition to the structural complexity of telecommunication systems, data accessibility has become an issue both in terms of privacy and access cost. We propose a method relying on bi-level Federated Graph Neural Networks to identify anomalies in the telecom network while ensuring reduced communication costs as well as data privacy. Our method considers telecom data as a bi-level graph, where the highest level graph represents the interaction between sites, and each site is further expanded to its software (SW) performance behaviour graph. We developed and compared 4G/5G SW Fault Detection models under 3 settings: (1) Centralized Temporal Graph Neural Networks model: we propose a model to detect anomalies in 4G/5G telecom data. (2) Federated Temporal Graph Neural Networks model: we propose Federated Learning (FL) as a mechanism for privacy-aware training of models for fault detection. (3) Personalized Federated Temporal Graph Neural Networks model: we propose a novel aggregation technique, referred to as FedGraph, leveraging both a graph and the similarities between sites for aggregating the models and proposing models more personalized to each site’s behaviour. We compare the benefits of Federated Learning (FL) models (2) and (3) with centralized training (1) in terms of SW performance data modelling, anomaly detection, and communication cost. The evaluation includes both a scenario with normal functioning sites and a scenario where only a subset of sites exhibit faulty behaviour. The combination of SW execution graphs with GNNs has shown improved modelling performance and minor gains in centralized settings (1). In a normal network context, FL models (2) and (3) perform comparably to centralized training (CL), with slight improvements observed when using the personalized strategy (3). However, in abnormal network scenarios, Federated Learning falls short of achieving comparable detection performance to centralized training. This is due to the unintended learning of abnormal site behaviour, particularly when employing the personalized model (3). These findings highlight the importance of carefully assessing and selecting suitable FL strategies for anomaly detection and model training on telecom network data.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)