Semantic Image Segmentation on Clothing Imagery with Deep Neural Networks

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Helena Alinder; [2020]


Abstract: Semantic Image Segmentation is a field within machine learning and computer vision where the goal is to assign a label to each pixel in an image. A successful segmentation labels all pixels that belong to an object with the correct label, and the quality of the prediction can be measured with a score known as mean Intersection over Union (mIoU). In one process for selling second-hand clothes, the clothes are placed on a mannequin, photographed, and then post-processed. The post-processing algorithm attempts to remove the pole of the mannequin and crop out the mannequin itself to create a clear background. The algorithm uses traditional computer vision and requires specific lighting and position settings; if these settings are faulty, the algorithm performs poorly. This thesis investigates how to conduct Semantic Image Segmentation with Deep Neural Networks for removing the pole and cropping out the mannequin, and whether the networks perform better than the traditional algorithm on images with bad lighting. Two deep neural networks were investigated: DeepLabv3+ and Gated-Shape CNN (GSCNN). The models' performances were measured by their mIoU scores and evaluated on a regular clothing dataset and on an augmented clothing dataset consisting of images that the traditional algorithm had problems segmenting. The conclusion of the thesis is that DeepLabv3+ performs better than GSCNN on regular clothing imagery, reaching an overall mIoU of 91.81%, and that the overall performances of the two networks on regular clothing imagery are statistically significantly different. DeepLabv3+ also performs better than the traditional algorithm when segmenting augmented clothing imagery, and the overall performances are statistically significantly different. On augmented images, there is no statistically significant difference in overall performance between DeepLabv3+ and GSCNN, nor between GSCNN and the traditional algorithm.
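
To make the mIoU score mentioned in the abstract concrete, here is a minimal Python sketch of how it is commonly computed from a predicted label map and a ground-truth label map. The function name `mean_iou`, the three class ids (0 = background, 1 = garment, 2 = pole), and the choice to average only over classes that occur in either map are assumptions made for illustration; the abstract does not state which classes or averaging convention the thesis uses.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean Intersection over Union, averaged over the classes that
    appear in the prediction or in the ground truth (an assumed convention)."""
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel is in both the predicted and the true region of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel is in the predicted region of p and the true region of t,
                # so it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no labelled pixels to evaluate")
    return sum(ious) / len(ious)


# Hypothetical 3x3 label maps: 0 = background, 1 = garment, 2 = pole.
prediction = [[0, 0, 1],
              [0, 1, 1],
              [2, 2, 1]]
ground_truth = [[0, 0, 1],
                [0, 1, 1],
                [0, 2, 1]]

score = mean_iou(prediction, ground_truth, num_classes=3)
print(f"mIoU = {score:.2f}")
```

In this toy example the per-class IoU values are 3/4 for background, 4/4 for the garment, and 1/2 for the pole, so a single mislabelled pixel on a small class lowers the mean noticeably.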
