Tracking the Time Trends of Swedish Literature and Finding Characteristics of Authors by Using Topic Modelling

This is a Master's thesis from Uppsala University/Department of Linguistics and Philology

Abstract: In this thesis, we trace time trends in Swedish literature and identify characteristics of authors. We apply latent Dirichlet allocation (LDA), a topic modelling method, to a corpus of 118 Swedish books and prose works collected from Litteraturbanken. The LDA model yields two findings: topics that focus on daily life, such as nature or family, occur frequently in the corpus, and peaks in a topic's time trend arise either from books on the same topic written by several authors or from books written by one author within a short period. LDA is also applicable to assessing the characteristics of authors: we list the distinctive topics for nine authors with more than three books in the corpus by comparing each author's topic distribution with the topic distribution of the entire corpus.
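The author comparison described in the abstract can be illustrated with a minimal sketch. It assumes that per-book topic distributions are already available from some fitted LDA model. The function names, the toy data, and the ratio threshold of 1.5 are hypothetical choices made for illustration; the abstract does not state which comparison criterion the thesis actually uses.

```python
from collections import defaultdict


def mean_distribution(rows):
    """Average a list of equal-length topic distributions, topic by topic."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def characteristic_topics(doc_topics, doc_authors, min_books=4, ratio=1.5):
    """Return, per author, the topics over-represented relative to the corpus.

    doc_topics:  one topic distribution per book (assumed LDA output).
    doc_authors: the author of each book, in the same order.
    min_books:   4 mirrors "more than three books" from the abstract.
    ratio:       hypothetical threshold, not taken from the thesis.
    """
    corpus = mean_distribution(doc_topics)

    # Group the per-book distributions by author.
    by_author = defaultdict(list)
    for dist, author in zip(doc_topics, doc_authors):
        by_author[author].append(dist)

    result = {}
    for author, rows in by_author.items():
        if len(rows) < min_books:
            continue  # too few books to characterise this author
        author_mean = mean_distribution(rows)
        result[author] = [
            k
            for k, (a, c) in enumerate(zip(author_mean, corpus))
            if c > 0 and a / c >= ratio
        ]
    return result


# Toy data (invented): 3 topics, 9 books, 3 authors.
doc_topics = (
    [[0.8, 0.1, 0.1]] * 4    # author A writes mostly about topic 0
    + [[0.1, 0.8, 0.1]] * 4  # author B writes mostly about topic 1
    + [[0.2, 0.2, 0.6]]      # author C has only one book
)
doc_authors = ["A"] * 4 + ["B"] * 4 + ["C"]

result = characteristic_topics(doc_topics, doc_authors)
print(result)
```

In this toy corpus, author C is skipped because a single book falls below the minimum, while A and B are each assigned the one topic that they use far more often than the corpus average.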
