Automatic Question Paraphrasing in Swedish with Deep Generative Models

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: Paraphrase generation refers to the task of automatically generating a paraphrase given an input sentence or text. Paraphrase generation is a fundamental yet challenging natural language processing (NLP) task and is utilized in a variety of applications such as question answering, information retrieval, conversational systems etc. In this study, we address the problem of paraphrase generation of questions in Swedish by evaluating two different deep generative models that have shown promising results on paraphrase generation of questions in English. The first model is a Conditional Variational Autoencoder (C-VAE) and the other model is an extension of the first one where a discriminator network is introduced into the model to form a Generative Adversarial Network (GAN) architecture. In addition to these models, a method not based on machine-learning was implemented to act as a baseline. The models were evaluated using both quantitative and qualitative measures including grammatical correctness and equivalence to source question. The results show that the deep generative models outperformed the baseline across all quantitative metrics. Furthermore, from the qualitative evaluation it was shown that the deep generative models outperformed the baseline at generating grammatically correct sentences, but there was no noticeable difference in terms of equivalence to the source question between the models. 

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)