Evaluating CNN Architectures on the CSAW-M Dataset

This is a Bachelor's thesis from KTH/Computer Science

Abstract: CSAW-M is a dataset that contains about 10,000 X-ray images created from mammograms. Mammograms are used in screening programmes to identify patients with breast cancer, with the goal of catching tumours early. Modern convolutional neural networks are highly sophisticated and can identify patterns that are nearly imperceptible to humans. CSAW-M does not contain images of active cancer tumours; rather, it records whether the patient will develop cancer or not. Classification tasks such as this are known to require large training datasets, which are cumbersome to acquire in the biomedical domain. In this paper we investigate how classification performance on non-trivial classification tasks scales with the number of available annotated images. To research this, a wide range of datasets were generated from CSAW-M, varying in sample size and cancer type. Three different convolutional neural networks were trained on all of these datasets. The study showed that classification performance does increase with the size of the annotated dataset: all three networks generally improved their predictions on the supplied benchmarking dataset. However, the improvements were very small, and the research question could not be conclusively answered. The primary reasons for this were the challenging nature of the classification task and the size of the dataset. Further research is required to understand how much data is needed to yield a usable model.
