An inquiry into the efficacy of convolutional neural networks in low-resolution video feeds for object detection

This is a Master's thesis from KTH/Mathematical Statistics

Abstract: In this thesis, several well-known models were investigated and compared with a custom model for people detection in low-resolution video feeds. YOLOv3 and SSD in particular are well-known models that, in their time, produced state-of-the-art results in competitions such as ImageNet and COCO. All models were compared on speed and accuracy; YOLOv3 was found to be the slowest and SSD the fastest. The proposed model was more accurate than both of these architectures, which can be attributed to the addition of newer techniques from research, such as leaving activations out and using a carefully balanced loss function. The results suggest that the proposed model can be deployed for real-time inference on cheap hardware, such as a Raspberry Pi 3B+ coupled with one or more AI accelerator sticks such as the Intel Neural Compute Stick 2, and that the networks are usable for detection even in poor-quality video streams.
