Grammatical Error Correction for Learners of Swedish as a Second Language

This is a Master's thesis from Uppsala universitet/Institutionen för lingvistik och filologi (Uppsala University, Department of Linguistics and Philology).

Abstract: Grammatical Error Correction (GEC) refers to the task of automatically correcting errors in written text, typically text written by learners of a second language. The work in this thesis implements and evaluates two methods for Grammatical Error Correction in Swedish. In addition, the proposed methods are compared to an existing rule-based system. Previous research on GEC for Swedish is limited and has not yet utilized the potential of neural networks. The first method implemented in this work is based on a neural machine translation approach, training a Transformer model to translate erroneous text into a corrected version. A parallel dataset containing artificially generated errors is created to train the model. The second method utilizes a Swedish version of the pre-trained language model BERT to estimate the likelihood of potential corrections in an erroneous text. Employing the SweLL gold corpus, which consists of essays written by learners of Swedish, the proposed methods are evaluated using GLEU and through a manual evaluation based on the types of errors and their corresponding corrections found in the essays. The results show that the two methods correct approximately the same number of errors, while differing in which error types they handle best. Specifically, the translation approach has wider coverage of error types and is superior for syntactic and punctuation errors. In contrast, the language model approach yields consistently higher recall and outperforms the translation approach with regard to lexical and morphological errors. To improve the results, future work could investigate the effect of increased model size and amount of training data, as well as the potential of combining the two methods.
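The abstract describes the second method only at a high level: a Swedish BERT model estimates how likely each candidate correction is at a given position. The sketch below illustrates that general idea with a masked language model. The checkpoint name KB/bert-base-swedish-cased, the score_candidates helper, and the restriction to single-subword candidates are assumptions made for illustration, not details taken from the thesis.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical checkpoint; the thesis only says "a Swedish version of BERT".
MODEL_NAME = "KB/bert-base-swedish-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def score_candidates(tokens, position, candidates):
    """Mask the token at `position` and rank candidate corrections by the
    masked language model's probability for that slot."""
    masked = tokens.copy()
    masked[position] = tokenizer.mask_token
    inputs = tokenizer(" ".join(masked), return_tensors="pt")
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    probs = logits.softmax(dim=-1)
    scores = {}
    for cand in candidates:
        cand_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(cand))
        if len(cand_ids) == 1:  # keep the sketch to single-subword candidates
            scores[cand] = probs[cand_ids[0]].item()
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Example: a morphological error, "ska går" instead of "ska gå".
print(score_candidates("jag ska går hem".split(), 2, ["går", "gå"]))
```

If the model assigns a clearly higher probability to a candidate other than the original token, that candidate can be proposed as a correction; the thesis does not specify the exact decision rule, so any thresholding here would be an implementation choice.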
