Mapping of open-answers using machine learning

This is a Master's thesis from KTH/Mathematical Statistics

Author: Viking Björk Friström; [2018]


Abstract: This thesis investigates whether a model can be created to map misspelled answers from open-ended questions to a finite set of brands. The data come from the company Nepa, which uses open-ended questions to measure brand awareness, and consist of misspelled answers and the brands they are to be mapped to. A data structure called a match candidate was created; it consists of a misspelled answer and a brand that the answer could potentially be mapped to. Features for the match candidates were engineered based on edit distances, posterior probability, and common misspellings, among others. Multiple machine learning models were tested for classifying the match candidates as positive if the mapping was correct and negative otherwise. The models were tested in two scenarios: one in which the answers in the training and testing data came from the same questions, and one in which they came from different questions. Among the classifiers tested, the random forest model performed best in terms of both positive predictive value (PPV) and sensitivity. In the first scenario, the resulting mapping identified on average 92% of the misspelled answers and mapped them with 98% accuracy. In the second scenario, on average 70% of the answers were identified, with 95% confidence in the mapping.
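
As a rough illustration of the match candidate idea described in the abstract, the Python sketch below pairs one misspelled answer with each brand in a finite set and computes a few edit-distance features per pair. The names `MatchCandidate`, `levenshtein`, and `build_candidates`, the specific features, and the example brand strings are all assumptions made for illustration; they are not taken from the thesis, and the posterior-probability and common-misspelling features it mentions are not shown.

```python
from dataclasses import dataclass


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


@dataclass(frozen=True)
class MatchCandidate:
    """Hypothetical structure: one answer paired with one possible brand."""
    answer: str
    brand: str

    def features(self) -> dict:
        # Illustrative features only; the thesis's actual feature set differs.
        a, b = self.answer.lower(), self.brand.lower()
        dist = levenshtein(a, b)
        longest = max(len(a), len(b), 1)  # avoid division by zero
        return {
            "edit_distance": dist,
            "normalized_distance": dist / longest,
            "same_first_letter": bool(a) and bool(b) and a[0] == b[0],
        }


def build_candidates(answer: str, brands: list[str]) -> list[MatchCandidate]:
    """Create one match candidate per brand for a single answer."""
    return [MatchCandidate(answer, brand) for brand in brands]


# Made-up example data, not from the Nepa data set.
candidates = build_candidates("adiddas", ["adidas", "nike"])
feature_rows = [c.features() for c in candidates]
```

Each row in `feature_rows` could then be labelled positive or negative and passed to a classifier such as a random forest, which is the kind of model the abstract reports as performing best.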

