Comparing machine learning methods for classification and generation of footprints of buildings from aerial imagery

Detta är en Uppsats för yrkesexamina på avancerad nivå från Blekinge Tekniska Högskola/Institutionen för programvaruteknik

Sammanfattning: The up to date mapping data is of great importance in social services and disaster relief as well as in city planning. The vast amounts of data and the constant increase of geographical changes lead to large loads of continuous manual analysis. This thesis takes the process of updating maps and breaks it down to the problem of discovering buildings by comparing different machine learning methods to automate the finding of buildings. The chosen methods, YOLOv3 and Mask R-CNN, are based on Region Convolutional Neural Network(R-CNN) due to their capabilities of image analysis in both speed and accuracy. The image data supplied by Lantmäteriet makes up the training and testing data; this data is then used by the chosen machine learning methods. The methods are trained at different time limits, the generated models are tested and the results analysed. The results lay ground for whether the model is reasonable to use in a fully or partly automated system for updating mapping data from aerial imagery. The tested methods showed volatile results through their first hour of training, with YOLOv3 being more so than Mask R-CNN. After the first hour and until the eight hour YOLOv3 shows a higher level of accuracy compared to Mask R-CNN. For YOLOv3, it seems that with more training, the recall increases while precision decreases. For Mask R-CNN, however, there is some trade-off between the recall and precision throughout the eight hours of training. While there is a 90 % confidence interval that the accuracy of YOLOv3 is decreasing for each hour of training after the first hour, the Mask R-CNN method shows that its accuracy is increasing for every hour of training,however, with a low confidence and can therefore not be scientifically relied upon. Due to differences in setups the image size varies between the methods, even though they train and test on the same areas; this results in a fair evaluation where YOLOv3 analyses one square kilometre 1.5 times faster than the Mask R-CNN method does. Both methods show potential for automated generation of footprints, however, the YOLOv3 method solely generates bounding boxes, leaving the step of polygonization to manual work while the Mask R-CNN does, as the name implies, create a mask of which the object is encapsulated. This extra step is thought to further automate the manual process and with viable results speed up the updating of map data.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)