Exploring Object Detection and Recognition Methods for Automated Book Inventory

This is a Master's thesis from Lunds universitet/Matematik LTH

Abstract: As camera-equipped devices become more widely available, more photos are taken and the amount of available image data has increased vastly. Researchers have learnt to exploit this data, using images for various purposes with Deep Learning based Computer Vision. For instance, one can teach a computer to perform an automated inventory, a task that can be very time-consuming when done by hand. Librarians are among those affected: they spend much of their time arranging and going through the books on the bookshelves. In an attempt to automate the inventory of books, different object detection and recognition methods have been explored in this Master's thesis. The state-of-the-art object detection method Mask Region Convolutional Neural Network (Mask R-CNN) was used to find book spines in images, and two Convolutional Neural Networks (CNNs) with different loss functions, categorical cross-entropy and triplet loss, were used to recognise each book that the Mask R-CNN had found. In addition, various data augmentation techniques were used to artificially expand and diversify the training dataset. The Mask R-CNN reached a mean Average Precision (mAP) of 94% on one of the test sets, and the recognition networks each reached 90% accuracy on the books that had been found.
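
For readers unfamiliar with the two recognition losses named in the abstract, the sketch below shows their textbook definitions in plain Python. It is an illustration only and not the thesis's implementation: the function names, the squared Euclidean distance, and the margin value of 0.2 are all assumptions chosen for the example.

```python
import math


def categorical_cross_entropy(probs, true_index):
    """Cross-entropy for one sample: the negative log of the probability
    the classifier assigned to the correct class (here, the correct book)."""
    return -math.log(probs[true_index])


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the anchor is closer to the positive (same book)
    than to the negative (different book) by at least `margin`.
    The margin of 0.2 is a hypothetical default, not taken from the thesis.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


if __name__ == "__main__":
    # Classifier view: predicted probabilities over three hypothetical book classes.
    print(categorical_cross_entropy([0.1, 0.8, 0.1], true_index=1))

    # Embedding view: two spines of the same book and one of a different book.
    print(triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0]))
```

The practical difference is that cross-entropy trains a classifier over a fixed set of book classes, whereas triplet loss trains an embedding in which images of the same book lie close together, so recognition can be done by comparing distances.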
