An evaluation of U-Net’s multi-label segmentation performance on PDF documents in a medical context

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: The Portable Document Format (PDF) is an ideal format for viewing and printing documents. Today many companies store their documents in a PDF format. However, the conversion from a PDF document to any other structured format is inherently difficult. As a result, a lot of the information contained in a PDF document is not directly accessible - this is problematic. Manual intervention is required to accurately convert a PDF into another file format - this can be deemed as both strenuous and exhaustive work. An automated solution to this process could greatly improve the accessibility to information in many companies. A significant amount of literature has investigated the process of extracting information from PDF documents in a structured way. In recent years these methodologies have become heavily dependent on computer vision. The work on this paper evaluates how the U-Net model handles multi-label segmentation on PDF documents in a medical context - extending on Stahl et al.’s work in 2018. Furthermore, it compares two newer extensions of the U-Net model, MultiResUNet (2019) and SS-U-Net (2021). Additionally, it assesses how each of the models performs in a data-sparse environment. The three models were implemented, trained, and then evaluated. Their performance was measured using the Dice coefficient, Jaccard coefficient, and percentage similarity. Furthermore, visual inspection was also used to analyze how the models performed from a perceptual standpoint. The results indicate that both the U-Net and the SS-U-Net are exceptional at segmenting PDF documents effectively in a data abundant environment. However, the SS-U-Net outperformed both the U-Net and the MultiResUNet in the data-sparse environment. Furthermore, the MultiResUNet significantly underperformed in comparison to both the U-Net and SS-U-Net models in both environments. The impressive results achieved by the U-Net and SS-U-Net models suggest that it can be combined with a larger system. This proposed system allows for accurate and structured extraction of information from PDF documents. 

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)