Spelling Correction in a Music Entity Search Engine by Learning from Historical Search Queries

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Maria Movin; [2018]

Keywords: Machine Learning; LSTM; Music search; Query Suggestion

Abstract: Query spelling correction is an important component of modern search engines: it helps users express their intent and thereby improves search quality. In this study, we investigated how accurately a sequence-to-sequence recurrent neural network (RNN) can recognise and correct misspellings in a music search engine when the model is trained on historical search queries. A sequence-to-sequence RNN was chosen because it has achieved state-of-the-art performance on similar tasks, such as machine translation and speech recognition. The findings indicate that the model learns to correct and complete queries with higher accuracy than a baseline model that returns the input query unchanged. However, more work is needed before the model would be good enough for production, in particular work on creating a cleaner, less biased training dataset. Nevertheless, our work strengthens the idea that sequence-to-sequence RNNs could be used as a spelling correction system in search engines.
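
The comparison against a baseline that returns the input query can be illustrated with a minimal sketch. It assumes accuracy means exact-match accuracy over (typed query, intended query) pairs, which the abstract does not specify. The function names and the example queries below are hypothetical and are not taken from the thesis.

```python
from typing import Callable, Iterable, Tuple


def identity_baseline(query: str) -> str:
    """Baseline described in the abstract: return the input query unchanged."""
    return query


def exact_match_accuracy(
    model: Callable[[str], str],
    pairs: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of (typed, intended) pairs where the model output equals the intended query.

    Assumption: exact string match is the metric; the thesis may define accuracy differently.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(1 for typed, intended in pairs if model(typed) == intended)
    return correct / len(pairs)


# Hypothetical (typed query, intended query) pairs, for illustration only.
examples = [
    ("beatles", "beatles"),
    ("beetles", "beatles"),
    ("abba", "abba"),
    ("metalica", "metallica"),
]

baseline_accuracy = exact_match_accuracy(identity_baseline, examples)
print(f"Baseline accuracy: {baseline_accuracy:.2f}")
```

The baseline is correct only on queries that were already spelled as intended, so any correction model has to beat that share to be useful. A trained model could be passed to `exact_match_accuracy` in place of `identity_baseline` to make the same comparison.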
