Geolocating Alfa Laval's products using supervised machine learning

This is a Master's thesis from Lund University/Department of Computer Science

Abstract: Many companies hold data that could be used to build a more successful business. To become more data-driven, a company must extract valuable information from its raw data. One of the largest challenges in making this transition is ensuring high data quality. In this thesis, we worked with Alfa Laval’s database of previously sold products. The main issue with this database was that the locations where the products had been installed were often missing. We report a solution for the hierarchical prediction of geolocation on three levels: country, city, and coordinates. To build the solution, we examined the three tasks using four different supervised machine learning algorithms. Given our prior knowledge and the attributes available in the database, most tasks yielded surprisingly good results. The prediction of countries and cities globally achieved an accuracy of 71% and 57%, respectively. Random forests were the best-performing algorithm overall for these two tasks. The prediction of coordinates for the United States was a harder task, resulting in a mean error distance of 872 km, which was achieved by an implementation of artificial neural networks. Our results showed that predicting country and city was an achievable goal, even though the existing input had no obvious connection to a location. Predicting coordinates, on the other hand, did not give a margin of error small enough to be useful for most applications.
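The abstract reports a mean error distance in kilometres but does not say how it was computed. As a minimal sketch only, assuming great-circle (haversine) distance on a spherical Earth, the metric could be calculated as follows. The function names, the Earth radius constant, and the choice of haversine are assumptions made for illustration and are not taken from the thesis.

```python
import math

# Assumption: mean Earth radius in km; the thesis does not state which value was used.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_error_distance_km(predicted, actual):
    """Average haversine distance between predicted and true (lat, lon) pairs."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and of equal length")
    total = sum(haversine_km(*p, *a) for p, a in zip(predicted, actual))
    return total / len(predicted)
```

A call such as `mean_error_distance_km([(40.7, -74.0)], [(34.1, -118.2)])` returns the error in kilometres for a single hypothetical prediction; averaging over a test set gives a figure comparable in kind to the one reported above.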
