Simultaneous Classification of Sets of Images Using Deep Learning and Clustering

This is a Master's thesis from Lund University/Mathematics LTH

Authors: Nellie Carleke; Hugo Sellerberg; [2020]

Keywords: Mathematics and Statistics

Abstract: Classification of cell images is conventionally done manually in hematology laboratories by medical technologists. CellaVision aims to automate this work to make the analysis process faster, better and more flexible. The automatic classification is currently done by passing each cell image individually through a Convolutional Neural Network. This approach does not exploit any correlations that might exist between cells from the same blood sample. We propose a method that first compresses the images of a whole sample using a Convolutional Neural Network and a Variational Autoencoder, then clusters the compressed data points using DBSCAN with parameters chosen by Bayesian Optimization, and finally assigns a cell class to each cluster using statistical tools such as Earth Mover's Distance. We used data from CellaVision's DC-1 system to train a Convolutional Neural Network that reached 90.68% accuracy on training data and 82.85% accuracy on test data. This network served both as a benchmark and as the foundation for our method. Applying our method raised the accuracies to 90.90% on training data and 83.13% on test data. We explored the feasibility of using our method on mixed cell data from different systems, but the results were not as good as on DC-1 data. Applied to images of handwritten digits from the MNIST dataset, our method could be made advantageous by forming customized subsets of images. This indicates that our method is versatile enough to use on general image data, provided that correlations exist within the subsets.
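The cluster-then-relabel idea in the abstract can be illustrated with a small sketch. This is not the thesis implementation, and it makes several assumptions: the embeddings are taken as given (the thesis produces them with a Convolutional Neural Network and a Variational Autoencoder), `eps` and `min_samples` are picked by hand rather than by Bayesian Optimization, and a plain majority vote stands in for the Earth Mover's Distance based class assignment. The names `dbscan`, `relabel_by_cluster`, and the class labels "A", "B", "C" are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels


def relabel_by_cluster(embeddings, cnn_labels, eps=0.5, min_samples=3):
    """Give every point in a cluster that cluster's most common CNN label.

    Assumption: majority vote replaces the thesis's EMD-based assignment.
    Noise points keep their individual CNN prediction.
    """
    clusters = dbscan(embeddings, eps, min_samples)
    votes = {}
    for c, y in zip(clusters, cnn_labels):
        if c != NOISE:
            votes.setdefault(c, Counter())[y] += 1
    return [
        y if c == NOISE else votes[c].most_common(1)[0][0]
        for c, y in zip(clusters, cnn_labels)
    ]


# Toy sample: two tight groups with one per-image misprediction each,
# plus one isolated point.
embeddings = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
              (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1),
              (10, 0)]
cnn_labels = ["A", "A", "B", "A", "B", "B", "B", "A", "C"]
corrected = relabel_by_cluster(embeddings, cnn_labels)
```

In this toy sample the two mispredicted points are overruled by their clusters, while the isolated point keeps its own prediction. The same mechanism can also hurt: if a cluster mixes several true classes, the minority is relabeled incorrectly, which is consistent with the abstract's caveat that the method needs correlations within each subset.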
