Unsupervised Clustering of Behavior Data From a Parking Application: A Heuristic and Deep Learning Approach

This is a thesis for a professional degree at advanced level from Umeå universitet/Institutionen för matematik och matematisk statistik (Umeå University, Department of Mathematics and Mathematical Statistics)

Abstract: This report presents a project on unsupervised clustering of human behavior in a parking application. As opportunities to collect and store data increase, so does the demand to use that data in meaningful ways. The purpose of this work is to explore common behaviors within the app and what they reveal about its usage. The first step was to transform event-based data into user sessions. The next step was to establish how to measure the similarity between sessions, which was done using two different approaches. The first approach is based on a combination of string metrics and heuristics. The second creates array representations of the sessions using an autoencoder. With these two ways of representing the similarity between sessions, we use clustering algorithms to assign labels to all sessions. Because the attributes of the data set were unknown, the versatile clustering algorithm HDBSCAN was applied to each of the two session representations separately. The clusters produced by HDBSCAN were compared to those produced by simple partitioning algorithms. Given the noisy nature of human behavior, HDBSCAN produced better clusters, with more distinct behaviors, than the simpler partitioning algorithms. Without a ground truth to rely on, evaluating the models proved to be a difficult part of the project. We used both quantitative metrics and qualitative methods for evaluation. In conclusion, our work provides a new way of evaluating user behavior, brings new insights into the different ways customers achieve their goals within the app, and lays the groundwork for connecting user behavior with transaction data.
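The first two steps of the pipeline described above (turning events into sessions, then measuring similarity with a string metric) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the event tuple layout, the event names, the 30-minute inactivity gap, and the choice of normalized Levenshtein distance as the string metric. The thesis's own heuristics and its autoencoder representation are not reproduced.

```python
from itertools import groupby

# Assumed inactivity gap (seconds) that closes a session; the abstract states no value.
SESSION_GAP_SECONDS = 30 * 60


def build_sessions(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp, event_name) tuples into per-user sessions.

    A new session starts whenever two consecutive events from the same
    user are more than `gap` seconds apart.
    """
    sessions = []
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for _, user_events in groupby(ordered, key=lambda e: e[0]):
        current, last_ts = [], None
        for _, ts, name in user_events:
            if last_ts is not None and ts - last_ts > gap:
                sessions.append(current)
                current = []
            current.append(name)
            last_ts = ts
        sessions.append(current)
    return sessions


def levenshtein(a, b):
    """Edit distance between two event sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def distance_matrix(sessions):
    """Pairwise edit distance, normalized by the longer session's length."""
    n = len(sessions)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            longest = max(len(sessions[i]), len(sessions[j])) or 1
            d = levenshtein(sessions[i], sessions[j]) / longest
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical event log: (user_id, unix_seconds, event_name).
events = [
    ("u1", 0, "open_app"), ("u1", 20, "search_zone"), ("u1", 45, "start_parking"),
    ("u1", 7200, "open_app"), ("u1", 7230, "stop_parking"),
    ("u2", 100, "open_app"), ("u2", 130, "start_parking"),
]
sessions = build_sessions(events)
dist = distance_matrix(sessions)
```

A symmetric matrix of this kind is the sort of input that HDBSCAN implementations can typically consume as a precomputed distance matrix (for example via `metric="precomputed"` in the `hdbscan` Python package), which is one plausible way the heuristic representation could be handed to the clustering step.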
