Anomaly Detection in Log Files Using Machine Learning

This is a one-year master's (M1) thesis from Luleå tekniska universitet/Institutionen för system- och rymdteknik

Abstract: Logs generated by applications, devices, and servers contain information that can be used to determine the health of a system. Manual inspection of logs is important, for example during upgrades, to determine whether the upgrade and data migration were successful. However, manual testing is not reliable enough, and manual inspection of logs is tedious and time-consuming. In this thesis, we propose using the machine learning techniques K-means and DBSCAN to find anomalous sequences in log files. We also investigate two data representation techniques: feature vector representation and IDF representation. The evaluation metrics F1 score, recall, and precision were used to analyze the performance of the applied algorithms. The study found large differences between the algorithms in how well they detect anomalies; the algorithms were better at finding the different kinds of anomalous sequences than at finding the total number of them. The results could help a user find anomalous sequences without manually inspecting the log file.
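To make the pipeline described in the abstract more concrete, the following is a minimal sketch and not the thesis's implementation. It builds event-count feature vectors for log sequences, reweights them with IDF, flags the sequences that DBSCAN would label as noise, and scores the result with precision, recall, and F1. Several things here are assumptions that the abstract does not state: the IDF formula log(N/df), the choice to treat DBSCAN noise points as anomalies, the values of eps and min_samples, and the toy event names. K-means is not shown.

```python
import math
from collections import Counter


def count_vectors(sequences, vocabulary):
    """Feature vector representation: one event-count vector per sequence."""
    vectors = []
    for seq in sequences:
        counts = Counter(seq)
        vectors.append([float(counts[event]) for event in vocabulary])
    return vectors


def idf_weights(sequences, vocabulary):
    """IDF weight per event, assuming the common form log(N / df)."""
    n = len(sequences)
    weights = []
    for event in vocabulary:
        df = sum(1 for seq in sequences if event in seq)
        weights.append(math.log(n / df) if df else 0.0)
    return weights


def apply_idf(vectors, weights):
    """IDF representation: scale each count by the weight of its event."""
    return [[c * w for c, w in zip(vec, weights)] for vec in vectors]


def dbscan_noise(vectors, eps, min_samples):
    """Indices that DBSCAN labels as noise.

    A point is a core point if at least min_samples points (itself included)
    lie within eps of it. A point is noise if it is not a core point and has
    no core point within eps.
    """
    n = len(vectors)
    neighbors = [
        [j for j in range(n) if math.dist(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbors[i]) >= min_samples}
    return [
        i for i in range(n)
        if i not in core and not any(j in core for j in neighbors[i])
    ]


def precision_recall_f1(predicted, actual):
    """Precision, recall, and F1 for predicted versus labeled anomaly indices."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: four normal upgrade sequences and one failed one.
normal = ["start", "migrate", "commit", "stop"]
failed = ["start", "migrate", "error", "rollback", "stop"]
sequences = [normal, normal, normal, normal, failed]
vocabulary = sorted({event for seq in sequences for event in seq})

weighted = apply_idf(count_vectors(sequences, vocabulary),
                     idf_weights(sequences, vocabulary))
noise = dbscan_noise(weighted, eps=0.5, min_samples=3)  # assumed parameters
print(noise)                            # [4]
print(precision_recall_f1(noise, [4]))  # (1.0, 1.0, 1.0)
```

In this toy example, events that occur in every sequence get an IDF weight of zero, so only the rare events (error, rollback) move the failed sequence away from the dense group of normal ones. On real logs, eps and min_samples would need tuning.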
