Sentiment Classification in Social Media: An Analysis of Methods and the Impact of Emoticon Removal

This is a Bachelor's thesis from KTH/School of Computer Science and Communication (CSC)

Authors: Andreas Pålsson; Daniel Szerszen [2016]


Abstract: Sentiment classification is the process of analyzing data and classifying it according to the sentiment it conveys, and it has a multitude of applications across industries. These different application areas, however, introduce diverse challenges to implementing the methods successfully. This report examines two of the main approaches to sentiment classification: machine learning and a glossary of weighted words (a lexicon). In addition, preprocessing is explored as an enhancement to both approaches. The approaches are tested on data collected from Twitter to examine their performance in social media. The results indicate that lexicon-based classifiers perform best, and that removing emoticons increases the correctness of classification.
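
To illustrate the two ideas the abstract names, a glossary of weighted words and emoticon removal as a preprocessing step, the following is a minimal sketch and not the authors' implementation. The lexicon entries, their weights, the emoticon pattern, and the function names are all assumptions made for illustration; the thesis's actual word list and preprocessing rules are not given here, and the sketch does not reproduce its results.

```python
import re

# Hypothetical weighted glossary (assumption): word -> sentiment weight.
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "awful": -2.0, "hate": -2.0,
}

# Assumed pattern for a few common standalone emoticons such as :) ;-( =D :P
EMOTICON_RE = re.compile(r"(?<!\S)[:;=]-?[)(DPp](?!\S)")


def remove_emoticons(text):
    """Preprocessing step: replace standalone emoticons with a space."""
    return EMOTICON_RE.sub(" ", text)


def classify(text, strip_emoticons=True):
    """Sum the lexicon weights of the tokens and map the sign to a label."""
    if strip_emoticons:
        text = remove_emoticons(text)
    # Lowercase, split on whitespace, and trim surrounding punctuation.
    tokens = [tok.strip(".,!?\"'") for tok in text.lower().split()]
    score = sum(LEXICON.get(tok, 0.0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I love this :)", "awful day :(", "hello there"]:
        print(tweet, "->", classify(tweet))
```

A machine-learning classifier would replace the fixed weights with parameters learned from labeled tweets; the preprocessing step would sit in the same place in either pipeline.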
