Anomaly Detection in an e-Transaction System using Data Driven Machine Learning Models: An unsupervised learning approach in time-series data

This is a Bachelor's thesis from Blekinge Tekniska Högskola/Institutionen för datavetenskap

Abstract: Background: Detecting anomalies in time-series data is a task that can be done with the help of data-driven machine learning models. This thesis investigates whether, and how well, different machine learning models using an unsupervised approach can detect anomalies in the e-Transaction system Ericsson Wallet Platform. In this domain, the anomalies are delays in the system. Objectives: The objective of this thesis is to compare four different machine learning models in order to find the most relevant one. The best-performing models are determined by the F1-score. An intersection of the best models is also evaluated, with the aim of reducing the number of false positives and thereby making the detection more precise. Methods: A relevant time-series data sample from the Ericsson Wallet Platform, with data points at 10-minute intervals, was investigated. The work proceeded in a number of steps: data handling, pre-processing, normalization, training, and evaluation. Two relevant features were trained separately as one-dimensional data sets. The two features used in this thesis, both relevant for finding delays in the system, are Mean wait (ms) and Mean ' N, where N is the number of calls to the system. The evaluation metrics used are True Positives, True Negatives, False Positives, False Negatives, Accuracy, Precision, Recall, F1-score, and the Jaccard index. The Jaccard index reveals how similar the algorithms are in their detections. Since detection is binary, each data point in the time series is classified as either anomalous or not. Results: The results reveal the two best-performing models with regard to the F1-score. The intersection evaluation reveals whether, and how well, a combination of the two best-performing models can reduce the number of false positives. Conclusions: The conclusion of this work is that some algorithms perform better than others.
It is a proof of concept that such classification algorithms can separate normal from non-normal behavior in the domain of the Ericsson Wallet Platform.
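
To make the evaluation described in the abstract concrete, the following is a minimal Python sketch of how precision, recall, F1-score, the Jaccard index between two detectors, and the intersection of two detectors could be computed on binary per-point labels. It is not the thesis code: the function names and the small label vectors (y_true, model_a, model_b) are hypothetical and were made up purely for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = anomaly, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1); each is 0.0 when undefined."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def jaccard_index(pred_a, pred_b):
    """Similarity of two detectors: |A and B| / |A or B| over flagged points."""
    both = sum(1 for a, b in zip(pred_a, pred_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(pred_a, pred_b) if a == 1 or b == 1)
    # Assumption: two detectors that flag nothing are treated as identical.
    return both / either if either else 1.0


def intersect(pred_a, pred_b):
    """Flag a point only if both detectors flag it."""
    return [a & b for a, b in zip(pred_a, pred_b)]


# Hypothetical labels for eight consecutive data points (illustration only).
y_true = [0, 0, 1, 1, 0, 0, 1, 0]
model_a = [0, 1, 1, 1, 0, 0, 1, 0]  # one false positive at index 1
model_b = [0, 0, 1, 1, 0, 1, 1, 0]  # one false positive at index 5

combined = intersect(model_a, model_b)
print("A:", precision_recall_f1(y_true, model_a))
print("B:", precision_recall_f1(y_true, model_b))
print("A and B:", precision_recall_f1(y_true, combined))
print("Jaccard(A, B):", jaccard_index(model_a, model_b))
```

In this toy example each detector raises a different false positive, so the intersection removes both. In general, intersecting detections can only lower or preserve recall, so it trades recall for precision; whether that trade improves the F1-score depends on the data.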
