OBJECT DETECTION: MODEL COMPARISON ON AUTOMATED DOCUMENT CONTENT INTERPRETATION - A performance evaluation of common object detection models

This is a Master's thesis from Umeå universitet/Institutionen för datavetenskap

Author: Henrik Eklund; [2019]

Abstract: Manually supervising a million documents a year is an exhausting task. A step towards automating this process is to interpret and classify the contents of a document. This thesis examines and evaluates different object detection models on the task of localizing and classifying multiple objects within a document, in order to find the best model for the situation. The theory of artificial neural networks is explained, along with the convolutional neural network and how it is used to perform object detection. Google's TensorFlow Object Detection API is used, and models and backbones are configured for training and evaluation. Data is collected to construct the data set, and the metrics used in the evaluation pipeline are explained. The models are compared both per category and in aggregate, using mean average precision (mAP) over several intersection over union (IoU) thresholds, inference time, memory usage, optimal localization recall precision (oLRP) error, and confidence thresholds optimized for localization, recall and precision. Finally, we determine whether some models are better suited to certain situations and tasks. With optimal confidence thresholds, different models performed best on different categories. R-FCN Inception v2 was judged the best overall detector for the task, based on its detection performance, speed and memory usage. The results and evaluation methods are discussed, and strategies for improvement are suggested as future work.
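
Since the comparison rests on intersection over union (IoU) thresholds, a minimal sketch of that quantity may help the reader. It is not taken from the thesis. The box format (x_min, y_min, x_max, y_max), the function names `iou` and `matches_at`, and the example threshold values are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption
    for the sketch, not the format used in the thesis.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at(pred_box, gt_box, thresholds):
    """For each IoU threshold, report whether the prediction counts as a match."""
    score = iou(pred_box, gt_box)
    return {t: score >= t for t in thresholds}


# Hypothetical prediction and ground-truth boxes; thresholds chosen for illustration.
result = matches_at((0, 0, 10, 10), (0, 0, 10, 8), thresholds=(0.5, 0.75, 0.9))
```

A detection that counts as correct at a loose threshold can fail at a stricter one, which is why averaging precision over several thresholds rewards tighter localization.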