Machine Learning for Automation of Chromosome based Genetic Diagnostics

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Chromosome based genetic diagnostics, the detection of specific chromosomes, plays an increasingly important role in medicine as the molecular basis of human disease is defined. The current diagnostic process is performed mainly by karyotyping specialists. They first arrange the chromosomes in pairs and generate an image listing all the chromosome pairs in order. This process is called karyotyping, and the generated image is called a karyogram. They then analyze the images based on the shapes, sizes, and relationships of the different image segments and make diagnostic decisions. Manual inspection is time-consuming, labor-intensive, and error-prone.

This thesis investigates supervised methods for genetic diagnostics on karyograms. The work mainly targets abnormality detection in the chromosome domain and reports the confidence of the result: the aim is to classify chromosome images as normal or abnormal and to give a confidence level for each prediction. The main contributions of this thesis are (1) an empirical study of chromosomes and karyotyping; (2) appropriate data preprocessing; (3) neural networks built using transfer learning; (4) experiments on different systems and conditions, with a comparison of the results; (5) the selection of a model suited to our requirements and a way to improve it; (6) a method to calculate the confidence level of the result by uncertainty estimation.

The empirical study shows that a karyogram is ordered as a whole, so preprocessing such as rotation and flipping is not appropriate. Adding noise or blur is more reasonable. In the experiments, two neural networks based on VGG16 and InceptionV3 were built using transfer learning, and their performance was compared under different conditions. We aim to minimize the number of abnormal cases predicted as normal, because misclassifying abnormal chromosomes as normal is unacceptable. This thesis describes how to use Monte Carlo Dropout to perform uncertainty estimation in a non-Bayesian model [1].
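
The Monte Carlo Dropout idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the tiny two-layer network, its weights, the feature vector, and the function names `stochastic_forward` and `mc_dropout_predict` are all hypothetical stand-ins for the VGG16 and InceptionV3 based models. The sketch only shows the general technique: keep dropout active at prediction time, run many stochastic forward passes, and use the spread of the outputs as an uncertainty estimate.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(features, hidden_weights, output_weights, drop_rate, rng):
    """One forward pass with dropout left ON; returns P(abnormal)."""
    keep = 1.0 - drop_rate
    hidden = []
    for row in hidden_weights:
        # ReLU activation of one hidden unit.
        activation = max(0.0, sum(w * x for w, x in zip(row, features)))
        # Inverted dropout: drop the unit with probability drop_rate,
        # otherwise rescale it by 1 / keep.
        hidden.append(activation / keep if rng.random() < keep else 0.0)
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


def mc_dropout_predict(features, hidden_weights, output_weights,
                       drop_rate=0.5, passes=200, seed=0):
    """Mean and standard deviation of P(abnormal) over stochastic passes."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(features, hidden_weights, output_weights, drop_rate, rng)
        for _ in range(passes)
    ]
    mean = sum(samples) / passes
    variance = sum((s - mean) ** 2 for s in samples) / passes
    return mean, math.sqrt(variance)


# Hypothetical toy inputs, chosen only for illustration.
FEATURES = [0.8, 0.3, 0.5]
HIDDEN_WEIGHTS = [[0.9, -0.4, 0.2], [-0.3, 0.8, 0.5], [0.6, 0.1, -0.7]]
OUTPUT_WEIGHTS = [1.2, -0.8, 0.5]

if __name__ == "__main__":
    mean, std = mc_dropout_predict(FEATURES, HIDDEN_WEIGHTS, OUTPUT_WEIGHTS)
    print(f"P(abnormal) = {mean:.3f}, uncertainty (std) = {std:.3f}")
```

A large standard deviation signals a prediction the model is unsure about. Given the stated requirement that abnormal cases must not be predicted as normal, one possible use (an assumption, not something the abstract specifies) is to refer such uncertain cases to a specialist instead of labelling them normal.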
