Exploring Advanced Clustering Techniques for Business Descriptions: A Comparative Study and Analysis of DBSCAN, K-Means, and Hierarchical Clustering

This is a thesis for a professional degree at advanced level from Mittuniversitetet (Mid Sweden University), Department of Computer and Electrical Engineering (2023-)

Abstract: In this study, we present several approaches for analyzing large volumes of business descriptions using machine learning clustering and classification algorithms. The goal is to classify these descriptions efficiently, narrowing the search scope and supporting better business insight and decision-making. Using unlabeled business description data, we apply three algorithms: Agglomerative Hierarchical Clustering (AHC), K-means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). For each method we vary the preprocessing techniques, parameters, and number of clusters, aiming to maximize the number of overlaps and to obtain suitable similarity scores within the resulting clusters. According to the implemented evaluation metrics, AHC yields the highest number of overlaps, followed by K-means and then DBSCAN. The conclusions drawn from this project can contribute to the development of automated systems for business description analysis. Furthermore, this research paves the way for further exploration and refinement of machine learning techniques in business analytics.
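To make the clustering step concrete, the following is a minimal sketch of agglomerative hierarchical clustering on short text descriptions, written in plain Python. It is not the thesis's implementation: the abstract does not state which preprocessing, distance measure, or linkage was used, so the bag-of-words vectors, cosine distance, average linkage, function names, and sample descriptions below are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words token counts (assumed stand-in for the real preprocessing)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative(texts, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of text indices."""
    vecs = [vectorize(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]  # start with one cluster per text

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vecs[i], vecs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical sample data, not taken from the thesis.
descriptions = [
    "bakery selling bread and cakes",
    "artisan bakery bread pastries",
    "software consulting and cloud services",
    "cloud software development consulting",
]
result = agglomerative(descriptions, 2)
print(result)
```

With these four sample descriptions the two bakery texts end up in one cluster and the two software texts in the other. For realistic data volumes a library implementation with TF-IDF or embedding features would likely be preferable to this quadratic-per-merge sketch.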
