Extracting relevant answer phrases from text : For usage in reading comprehension question generation

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: This report presents a method for extracting answer phrases, suitable as answers to reading comprehension questions, from Swedish text. All code used to produce the results is available on GitHub. The method is developed using a Swedish BERT, a pre-trained language model based on neural networks. The BERT model is fine-tuned for three different tasks: two variations of token classification for answer extraction, and one sentence classification task aimed at identifying relevant sentences. The dataset used for fine-tuning consists of 1814 question and answer pairs posed on 598 different texts, partitioned into training, validation and test sets. The models are assessed individually and are furthermore combined, using a method based on roundtrip consistency, into a system for filtering extracted answer phrases. The results for each of the models, and for the system combining them, are evaluated using both quantitative measures (precision, recall and Jaccard index) and qualitative measures. In the qualitative evaluation we both inspect the results produced by the models and conduct a structured human evaluation with the help of four external evaluators. The final answer extraction model achieves a precision of 0.02 and a recall of 0.95, with an average Jaccard index of 0.55 between the extracted answer phrases and the targets. When the filtering system is applied, the precision is 0.03, the recall 0.50 and the Jaccard index 0.62 on a subset of the test data. The answer extraction model matches the baseline on precision, outperforms it on recall by a large margin, and performs worse than the baseline on Jaccard index. The method applying filtering, which is evaluated on a subset of the test set, has worse precision than the baseline but outperforms it on both recall and Jaccard index.
In the qualitative evaluation we detect some flaws in the grammatical correctness of the extracted answers, as over 50% of them are classified as not grammatically correct. The joint result of the two evaluators on suitability shows that 32% of the grammatically correct answers are suitable as answer phrases.
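
To make the quantitative measures named above concrete, the following is a minimal Python sketch of how a Jaccard index between an extracted phrase and a target phrase, and a precision and recall over sets of phrases, could be computed. It is not taken from the thesis code: the function names `jaccard_index` and `precision_recall` are hypothetical, and the choices of lowercased whitespace tokenization and exact-match comparison are assumptions made for illustration. The thesis may define a match differently.

```python
from typing import Iterable, Tuple


def jaccard_index(extracted: str, target: str) -> float:
    """Token-level Jaccard index between two phrases.

    Assumption: phrases are compared as sets of lowercased,
    whitespace-separated tokens.
    """
    a = set(extracted.lower().split())
    b = set(target.lower().split())
    if not a and not b:
        # Two empty phrases are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def precision_recall(extracted: Iterable[str],
                     targets: Iterable[str]) -> Tuple[float, float]:
    """Precision and recall of extracted phrases against target phrases.

    Assumption: a phrase counts as correct only on an exact
    (case-insensitive, whitespace-trimmed) match.
    """
    ext = {p.strip().lower() for p in extracted}
    tgt = {p.strip().lower() for p in targets}
    hits = len(ext & tgt)
    precision = hits / len(ext) if ext else 0.0
    recall = hits / len(tgt) if tgt else 0.0
    return precision, recall


if __name__ == "__main__":
    # The extracted phrase includes one extra token compared to the target.
    print(jaccard_index("i Stockholm", "Stockholm"))
    # Two phrases extracted, one of them matches the single target.
    print(precision_recall(["i Stockholm", "Stockholm"], ["Stockholm"]))
```

Under these assumptions, a model that extracts many candidate phrases per text can reach high recall while its precision stays low, which is the pattern the reported figures show.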
