Classifying Electricity Tower Image Data with Unsupervised Curriculum Learning

This is a Master's thesis from Luleå tekniska universitet/Institutionen för system- och rymdteknik

Author: Jacob Wedin; [2022]


Abstract: The core objective of this thesis is to classify a set of aerial photographs into two categories: those that contain an electricity tower and those that do not. While supervised deep learning methods can reach excellent results on such tasks, they require large amounts of labeled data, and creating such datasets manually is very time-consuming. An unsupervised classification method would therefore be highly useful. One such method is Unsupervised Curriculum Learning (UCL). The UCL framework consists of three main parts. The first is a deep convolutional neural network (CNN) used to extract features from the images. The second generates clusters based on the extracted features. Finally, the samples closest to the centroid of each cluster are used to train the CNN, with the cluster assignments serving as labels. The dataset consists of high-resolution grayscale aerial photographs and has been cleaned and balanced for this thesis. The thesis compares the performance of the UCL framework with that of a supervised model on this dataset, and several models have been evaluated. A novel method for determining which samples to include in fine-tuning, suited to datasets where one of the classes has a much higher average similarity than the other, has also been introduced and evaluated. The best-performing UCL model, in which the first half of the layers were frozen, reached an accuracy of 0.87 on the balanced and cleaned dataset. The thesis concludes that UCL is a viable method for classifying the data used, albeit less accurate than a supervised method. The novel method for selecting samples for the fine-tuning stage performed well and is recommended for cases where the classes have highly differing average similarity scores.
As suggestions for further research, even better results might be attained by using a different clustering algorithm or by reducing the dimensionality of the extracted deep feature vector. Another recommendation is to work with unbalanced data.
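
The clustering and sample-selection steps of the UCL framework described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not name the clustering algorithm, so plain k-means with Euclidean distance is assumed here. The CNN feature extractor is replaced by hand-written 2-D toy vectors, and the function names `kmeans` and `select_near_centroid` and the parameter `per_cluster` are hypothetical. The novel selection method mentioned in the abstract is not reproduced.

```python
import math
import random


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering algorithm).

    Returns (centroids, labels), where labels[i] is the cluster of features[i].
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct samples.
    centroids = [list(c) for c in rng.sample(features, k)]

    def assign(points):
        # Each point goes to its nearest centroid (Euclidean distance).
        return [min(range(k), key=lambda j: math.dist(x, centroids[j]))
                for x in points]

    for _ in range(iters):
        labels = assign(features)
        for j in range(k):
            members = [x for x, lab in zip(features, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(features)


def select_near_centroid(features, centroids, labels, per_cluster):
    """Map each cluster id to the indices of its per_cluster nearest samples.

    These indices, paired with their cluster ids as pseudo-labels, would form
    the fine-tuning set for the feature extractor.
    """
    selected = {}
    for j, centroid in enumerate(centroids):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        idx.sort(key=lambda i: math.dist(features[i], centroid))
        selected[j] = idx[:per_cluster]
    return selected


# Toy stand-in for extracted deep features: two well-separated groups,
# each with one sample (index 3 and index 7) lying far from its group.
features = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.5, 1.5),
    (10.0, 10.0), (10.1, 9.9), (9.9, 10.2), (8.0, 8.5),
]
centroids, labels = kmeans(features, k=2)
selected = select_near_centroid(features, centroids, labels, per_cluster=2)
for cluster_id, indices in selected.items():
    print(cluster_id, indices)
```

In a full UCL loop, the selected samples would be used to fine-tune the CNN, after which features are re-extracted and the clustering is repeated. Choosing only the samples nearest each centroid keeps the pseudo-labels as reliable as possible, at the cost of training on fewer images per round.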
