Named Entity Recognition for Detecting Trends in Biomedical Literature

This is a Master's thesis from Umeå University/Department of Computing Science

Author: Betty Törnkvist; [2024]

Keywords: NLP NER CHO;

Abstract: The number of publications in the biomedical field is increasing exponentially, making it ever harder to keep up with current research. Rapid advances in Natural Language Processing (NLP) offer possible solutions to this problem. In this thesis we investigate three questions that matter for applying NLP, specifically the two subfields Named Entity Recognition (NER) and Large Language Models (LLMs), to this problem: how LLMs compare to NER models on NER tasks, how important normalization is, and how the analysis is affected by the availability of data. For the first question, we find that the two kinds of model offer reasonably comparable performance on the specific task we consider. For the second, we find that normalization plays a substantial role in improving results for tasks involving data synthesis and analysis. For the third, we find that access to full papers is important in most cases, since important information can be hidden outside the abstracts.
