Transforming Legal Entity Recognition

This is a Master's thesis from Uppsala University, Department of Statistics.

Abstract: Transformer-based architectures have in recent years advanced the state of the art in Natural Language Processing. Researchers have successfully adapted such models to downstream NLP tasks in domain-specific settings. This thesis examines the application of these models to the legal domain by performing Named Entity Recognition (NER) in a setting where training data is scarce. Three different pre-trained BERT models are fine-tuned on a set of 101 court case documents, of which one model is pre-trained on legal corpora and the other two on general corpora. Experiments are run to evaluate the models' predictive performance when fine-tuned on smaller or larger quantities of data. Results show that BERT models work reasonably well for NER with legal data. Unlike many other domain-specific BERT models, the model pre-trained on legal corpora does not outperform the general-purpose base models. Modest amounts of annotated data appear sufficient for reasonably good performance.
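For readers unfamiliar with the setup the abstract describes, the sketch below shows the general shape of fine-tuning a pre-trained BERT model for token-level NER. It is a minimal illustration using the Hugging Face transformers library; the checkpoint name, label set, and toy sentence are assumptions for demonstration, not the thesis's actual models, tag scheme, or court case data.

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical NER tag set and checkpoint; the thesis compares one
# legal-domain and two general-domain BERT models (names not shown here).
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG"]
model_name = "bert-base-cased"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# One toy annotated sentence; real fine-tuning would iterate over the
# annotated court case documents described in the abstract.
words = ["The", "court", "heard", "Anna", "Lind", "today"]
tags = ["O", "O", "O", "B-PER", "I-PER", "O"]

encoding = tokenizer(
    words, is_split_into_words=True, return_tensors="pt", truncation=True
)

# Align word-level tags to subword tokens; special tokens get -100
# so the cross-entropy loss ignores them.
label_ids = [
    -100 if word_id is None else labels.index(tags[word_id])
    for word_id in encoding.word_ids(batch_index=0)
]
encoding["labels"] = torch.tensor([label_ids])

# A single forward/backward step stands in for a full training loop.
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**encoding).loss
loss.backward()
optimizer.step()
print(f"token-classification loss: {loss.item():.3f}")

Swapping model_name for a domain-specific checkpoint is the only change needed to reproduce the legal-versus-general comparison pattern the abstract describes, which is what makes this kind of controlled experiment straightforward to run.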
