Summarizing Product Reviews Using Dynamic Relation Extraction

This is a professional degree thesis (advanced level) from Lund University, Department of Computer Science

Abstract: The accumulated review data for a single product on Amazon.com could potentially take several weeks to examine manually. Computationally extracting the essence of a document is a substantial task, which has been explored previously through many different approaches. We explore how statistical prediction can be used to perform dynamic relation extraction. Using patterns in the syntactic structure of a sentence, each word is classified as either product feature or descriptor, and the two are then linked together by association. The classifiers are trained with a manually annotated training set and features from dependency parse trees produced by the Stanford CoreNLP library. In this thesis we compare the most widely used machine learning algorithms to find the one most suitable for our scenario. We ultimately found that support vector machines (SVM) performed best, reaching an F-score of 80 percent on the relation extraction classification step. The results of the predictions are presented in a graphical interface displaying the relations. An end-to-end evaluation was also conducted, in which our system achieved a relaxed recall of 53.35%.
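
For readers unfamiliar with the metric, the following is a minimal sketch of how an F-score is computed from classification counts. The abstract does not say which F-score variant was used, so the balanced F1 (beta = 1) is assumed here; the function name and the counts are hypothetical and are not taken from the thesis.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from raw counts of true positives, false positives
    and false negatives. With beta = 1 this is the harmonic mean of
    precision and recall (an assumed variant, see the text above)."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, chosen only for illustration.
score = f_score(tp=80, fp=20, fn=20)
print(f"F1 = {score:.2f}")
```

With these illustrative counts, precision and recall are both 0.8, so the balanced F-score is also 0.8.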
