Automation of manual tasks at Statistics Sweden: Supervised machine learning as proof-of-concept

This is a Master's thesis from Uppsala University/Department of Information Technology

Author: Simon Godskesen; [2024]

Abstract: Supervised machine learning is used to build a proof-of-concept for automating manual tasks at Statistics Sweden. The first part classifies occupations with an SSYK code from the description entered by the employee, the employee's education level, and their industry. Using Word2Vec together with Random Forest, 70% of the data could be assigned a 4-digit SSYK code with 90% accuracy; of the remaining 30%, 62.3% could be assigned up to five 3-level codes with 66.4% accuracy. The second part classifies food items from the item name and an initial guess provided by the stores. The model was trained on data from 2014-2021 and tested on 2022 data, from which items with EAN codes matching items in the training set had been removed. One-hot encoding (OHE) with logistic regression achieved 72% accuracy.
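
The evaluation setup for the food-item task (train on 2014-2021, test on 2022, drop test items whose EAN code already appears in the training set) can be illustrated with a minimal sketch. The record layout, field names, and sample items below are hypothetical and are not taken from the thesis; only the split rule follows the abstract.

```python
def split_without_ean_overlap(items, train_years=range(2014, 2022), test_year=2022):
    """Split items by year, then drop test items whose EAN was seen in training.

    `items` is a list of dicts with hypothetical keys "ean", "name", "year".
    """
    train = [it for it in items if it["year"] in train_years]
    seen_eans = {it["ean"] for it in train}
    test = [
        it for it in items
        if it["year"] == test_year and it["ean"] not in seen_eans
    ]
    return train, test


# Made-up example records (assumption: one dict per item observation).
items = [
    {"ean": "7310000000011", "name": "milk 3% 1l", "year": 2019},
    {"ean": "7310000000028", "name": "rye bread 500g", "year": 2021},
    {"ean": "7310000000011", "name": "milk 3% 1l", "year": 2022},   # EAN seen in training
    {"ean": "7310000000035", "name": "oat drink 1l", "year": 2022}, # new EAN
]

train, test = split_without_ean_overlap(items)
print(len(train), [it["name"] for it in test])
```

Filtering on EAN in this way means the reported test accuracy reflects items the model has not seen before, rather than exact repeats of training items.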
