Predicting Biomarkers/ Candidate Genes involved in iALL, using Rough Sets based Interpretable Machine Learning Model.

This is a Master's thesis from Uppsala University/Biology Education Centre

Abstract: Acute lymphoblastic leukemia (ALL) is a hematological malignancy that originates in the bone marrow, where the malignant cells gain a proliferative advantage. One of the more common genetic alterations in ALL is the KMT2A rearrangement, which accounts for 80% of ALL cases in infants. Patients carrying the KMT2A rearrangement have a poor prognosis and eventually develop drug resistance. This project aimed to find new therapeutic targets that could help in the development of novel drugs. We designed a model that uses gene expression data to infer the expression of oncogenes and of genes that could be associated with immune pathways. The data were extracted and transformed by removing batch effects and identifying the biotypes of the genes for more focused research. Because we used exome RNA-seq data, it was necessary to reduce the high dimensionality of the data. The dimensionality reduction was performed using Monte Carlo Feature Selection, which yielded a list of highly significant genes. These genes were used in a machine learning model, R.ROSETTA, which produces rule-based results grounded in rough set theory. The rules were visualized using VisuNet, an interactive tool that creates networks from the rules. Using the machine learning model, we identified the expression levels of genes such as JAK3, TOX3, and DMRTA1, among others, and their relations to other genes. The significant genes were also used for pathway analysis with pathfindR, which allowed us to infer the oncogenic pathways involved. The pathway analysis helped us identify pathways, such as immunodeficiency and other signaling pathways, that could be potential drug targets.
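
The rough set idea behind the rule-based model can be illustrated with a small sketch. This is not the R.ROSETTA API and not the thesis pipeline; it is a minimal Python illustration, assuming a toy decision table with discretized expression levels. The sample ids, the gene names `geneA` and `geneB`, and the labels are all hypothetical and invented for the example. Samples that share the same attribute values are indiscernible; the lower approximation holds the samples that certainly belong to a class, and the upper approximation holds those that possibly belong to it.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group sample ids that share the same values on the given attributes."""
    classes = defaultdict(set)
    for sample_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(sample_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) rough set approximations of the target set."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # every sample in the block is in the target
            lower |= block
        if block & target:    # at least one sample in the block is in the target
            upper |= block
    return lower, upper


# Hypothetical toy decision table (invented values, not data from the thesis).
TABLE = {
    "s1": {"geneA": "high", "geneB": "low", "label": "case"},
    "s2": {"geneA": "high", "geneB": "low", "label": "case"},
    "s3": {"geneA": "low", "geneB": "low", "label": "control"},
    "s4": {"geneA": "low", "geneB": "high", "label": "case"},
    "s5": {"geneA": "low", "geneB": "high", "label": "control"},
}

CASES = {s for s, row in TABLE.items() if row["label"] == "case"}
LOWER, UPPER = approximations(TABLE, ["geneA", "geneB"], CASES)
BOUNDARY = UPPER - LOWER  # samples the attributes cannot classify with certainty
```

In this toy table, the samples in `LOWER` support a certain rule of the form "IF geneA = high AND geneB = low THEN case", while the samples in `BOUNDARY` share attribute values but differ in label, so any rule covering them is only approximate.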
