Sentiment Analysis for Swedish: The Impact of Emojis on Sentiment Analysis of Swedish Informal Texts

This is a Master's thesis from Umeå University/Department of Computing Science

Abstract: This study investigates the use of emojis in sentiment analysis for the Swedish language, with the objective of assessing whether emojis improve model performance. Sentiment analysis is an NLP classification task aimed at extracting people's opinions, sentiments, and attitudes from language. Although sentiment analysis as a research area has made considerable progress recently, some challenges remain. This work considers two of them: the analysis of a non-English language and the impact of emojis. These areas were explored by creating a sentiment-annotated dataset of Swedish texts containing emojis and by creating a Swedish sentiment analysis model for evaluation. The resulting model, SweVADER, was based on the English lexicon-based model VADER. The best-performing SweVADER model achieved an accuracy of 0.53 and an F1-score of 0.47. The presence of emojis improved the analysis for most models, but only slightly. The results indicate that the use of emojis can improve sentiment analysis, but that other features affected the results as well. The sentiment lexicon used plays a key role, and pre-processing techniques such as stemming could also affect performance. A takeaway from this study is that emojis contain important sentiment information and should not be disregarded. Furthermore, emojis are useful when analyzing texts in a language that lacks linguistic resources.
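To make the idea of a lexicon-based model with emoji support concrete, the following is a minimal sketch and not the SweVADER implementation. It assumes a toy word lexicon and a toy emoji lexicon whose entries and valence values are invented for illustration, assumes emojis are separated from words by whitespace, and uses a simple squashing normalization with hypothetical `alpha` and `threshold` parameters.

```python
import math

# Hypothetical toy lexicons: the words, emojis, and valence values are
# invented for illustration and are not taken from SweVADER.
WORD_LEXICON = {"bra": 1.9, "dålig": -2.1, "älskar": 3.0, "hatar": -2.7}
EMOJI_LEXICON = {"😍": 2.5, "😡": -2.4, "👍": 1.5}


def score(text, use_emojis=True, alpha=15.0):
    """Sum the valence of each whitespace-separated token and squash
    the total into the open interval (-1, 1)."""
    total = 0.0
    for token in text.lower().split():
        total += WORD_LEXICON.get(token, 0.0)
        if use_emojis:
            total += EMOJI_LEXICON.get(token, 0.0)
    return total / math.sqrt(total * total + alpha)


def classify(text, use_emojis=True, threshold=0.05):
    """Map the squashed score to one of three sentiment labels."""
    s = score(text, use_emojis)
    if s >= threshold:
        return "positive"
    if s <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample = "maten var okej 😍"
    print(classify(sample, use_emojis=True))
    print(classify(sample, use_emojis=False))
```

In this sketch, a text whose words are all missing from the word lexicon falls back to "neutral" unless its emojis are scored, which is one way to picture why emojis may help when lexical resources for a language are limited.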
