Automatic Transcription of Historical Documents: Transkribus as a Tool for Libraries, Archives and Scholars

This is a Magister (Master's) thesis from Uppsala universitet/Institutionen för ABM

Abstract: Digital libraries and archives are major portals to rich sources of information. They undertake large-scale digitization to enhance their digital collections and offer users valuable text data. Handwritten documents, however, are usually provided only as digitized images, without accompanying transcriptions. Text that is not machine-readable limits the research contemporary scholars can conduct, especially with digital humanities approaches such as distant reading and data mining. The purpose of this thesis is to evaluate the Transkribus platform as a linguistic tool developed mainly for producing automatic transcriptions of handwritten documents. The results are correlated with the findings of a questionnaire distributed to libraries and archives across Europe to learn more about the policies they follow regarding manuscripts and transcription provision. A model for a specific writing style in Latin is trained, and its accuracy is tested on various Latin handwritten pages. Finally, the tool’s validation is discussed, as well as the extent to which it meets the general needs of cultural heritage institutions and humanities scholars.
