Machine Learning for Reducing the Effort of Conducting Systematic Reviews in SE

This is a Bachelor's thesis from Göteborgs universitet/Institutionen för data- och informationsteknik (University of Gothenburg, Department of Computer Science and Engineering)

Abstract:

Objective: To investigate whether machine learning and text-based data mining can be used to support the primary study selection process and reduce the effort needed in systematic reviews conducted in the context of SE.

Research Design: A test collection was built from 3 systematic reviews used in previous work in the context of SE. The proposed probabilistic classifier, based on Bayes' Theorem, was constructed to predict and classify each article as containing, or not containing, evidence of high enough quality to warrant inclusion in the study selection process. Feature engineering techniques were applied to the abstract-based features. Cross-validation experiments were performed to evaluate the performance of the document classifier. Three metrics (precision, recall and specificity) were used together to measure classification performance. We assume that a recall of 0.9 or higher is required for the classifier to identify a sufficient quantity of relevant papers. As long as recall is at least 0.9, precision and specificity should be as high as possible.

Results: In the hold-out cross-validation experiment, the precision achieved with the classifier was 93% for two of the systematic review topics and 79% for the third. The results of the leave-one-out cross-validation experiment were presented in three confusion matrices, which indicated in detail that the precision achieved with the classifier for the three systematic review topics was promising in terms of predicting relevant abstracts but relatively poor in terms of excluding irrelevant articles.

Conclusion: The classifier based on Bayes' Theorem has strong potential for performing systematic review classification tasks in software engineering. The approach presented in this paper could be considered as a possible technique for assisting the labor-intensive primary study selection process in an SLR.
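To make the evaluation criteria concrete, the following is a minimal sketch of how precision, recall and specificity can be derived from a confusion matrix, together with the 0.9 recall requirement stated in the abstract. The class and function names, and the counts in the usage example, are hypothetical illustrations and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Counts for a binary include/exclude decision (hypothetical helper)."""
    tp: int  # relevant articles predicted as relevant
    fp: int  # irrelevant articles predicted as relevant
    tn: int  # irrelevant articles predicted as irrelevant
    fn: int  # relevant articles predicted as irrelevant

    def precision(self) -> float:
        # Share of predicted-relevant articles that are truly relevant.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Share of truly relevant articles that the classifier found.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def specificity(self) -> float:
        # Share of truly irrelevant articles that were correctly excluded.
        denom = self.tn + self.fp
        return self.tn / denom if denom else 0.0


def confusion_from_labels(actual, predicted) -> ConfusionMatrix:
    """Build the matrix from parallel lists of booleans (True = relevant)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and p:
            fp += 1
        elif not a and not p:
            tn += 1
        else:
            fn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def meets_recall_target(cm: ConfusionMatrix, min_recall: float = 0.9) -> bool:
    # The abstract assumes recall of at least 0.9 is required; precision and
    # specificity are then maximised subject to that constraint.
    return cm.recall() >= min_recall


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    cm = ConfusionMatrix(tp=9, fp=3, tn=27, fn=1)
    print(cm.precision(), cm.recall(), cm.specificity())
    print(meets_recall_target(cm))
```

Under this reading, a classifier configuration is only a candidate if `meets_recall_target` holds; among candidates, the one with the highest precision and specificity would be preferred.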
