Log Anomaly Detection of Structured Logs in a Distributed Cloud System

This is a professional degree thesis at advanced level from Lund University/Department of Automatic Control

Authors: David Nilsson; Albin Olsson; [2022]

Keywords: Technology and Engineering;

Abstract: As computer systems grow larger and more complex, the task of maintaining them and finding potential security threats or other malfunctions becomes increasingly hard. Traditionally, this has had to be done by manually examining the logs. In modern systems, this can become infeasible due to either the sheer volume of logs or the complexity of the system. By using machine learning based anomaly detection to analyze system logs, this examination can be automated. In this thesis the authors have researched the area of anomaly detection and implemented an anomaly detection pipeline for a specific system. Three different machine learning based anomaly detection models were implemented, namely a clustering algorithm, PCA, and a neural network in the form of an autoencoder. These models were compared against and evaluated relative to a baseline error detection system that was already in place for the target system. They were also compared against each other to find which models performed best, and under which circumstances. To compare the models, six different types of known anomalies were injected into the data. All of the models were found to outperform the baseline system. In the first experiment, where the models were trained and tested using data from the same time period, PCA achieved the highest F1-score, 0.990. In the second experiment, the models were trained and tested using data from separate time periods. In this scenario, the clustering algorithm outperformed the others, with an F1-score of 0.879. Both PCA and the autoencoder produced many false positives, reducing their precision and thereby their F1-score.
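
The link between false positives, precision, and F1-score mentioned in the abstract can be illustrated with a minimal sketch. The function name `f1_score` and all counts below are hypothetical and chosen only for illustration; they are not taken from the thesis. The sketch assumes the standard definition of F1 as the harmonic mean of precision and recall.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    tp: anomalies correctly flagged (true positives)
    fp: normal log entries wrongly flagged (false positives)
    fn: anomalies that were missed (false negatives)
    """
    if tp == 0:
        # Without any true positives, precision and recall are zero
        # (or undefined), so F1 is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the thesis.
# Both detectors find the same 90 of 100 injected anomalies (same recall),
# but the second one raises many more false alarms.
few_false_positives = f1_score(tp=90, fp=5, fn=10)
many_false_positives = f1_score(tp=90, fp=60, fn=10)

print(round(few_false_positives, 3))   # 0.923
print(round(many_false_positives, 3))  # 0.72
```

With recall held fixed, the additional false positives lower precision from roughly 0.95 to 0.60, and the F1-score drops with it. This is the general mechanism the abstract describes for PCA and the autoencoder in the second experiment.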
