Identifying esophageal atresia associated variants from whole genome sequencing data

This is a thesis for a professional degree at advanced level from Uppsala universitet/Institutionen för immunologi, genetik och patologi

Author: Jonas Mattisson; [2018]

Abstract: Knowing the underlying cause of a genetic disorder could not only further our understanding of the disease itself and of the otherwise healthy mechanism that is disrupted, but could also potentially improve people’s lives. Even though whole genome sequencing has drastically improved the potential for discovering the cause, a comparison of the genomes of two unrelated individuals will find several million sequence variations. While most variants have no significant impact, a single variant that functionally impacts a gene is enough to cause a genetic disorder. This project therefore focused on filtering variants, from lists of several million possible causes down to the stage where they could feasibly be analysed manually, one by one. Single-nucleotide variants, indels and structural variants were filtered, based on a dataset in which single-nucleotide variants and indels had already been called. The more difficult process of structural variant discovery was also performed, but it required the application of four different tools to minimise the drawbacks of each separate discovery technique. The same three filtering approaches were applied to all variants: intersecting datasets that should contain the same variant, removing variants shared with the general population, and selecting variants that impact functionality. Each approach proved to be an efficient filtering step, and their combination reduced each list to only a couple of variants out of the original five million. Due to the lower accuracy and sensitivity of the structural variant analysis, this data will likely require more extensive manual analysis.
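
The three filtering approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's actual pipeline: the `Variant` type, the `filter_variants` function, the in-memory dictionaries, the 1% allele-frequency cutoff and the `HIGH`/`MODERATE` impact labels are all assumptions made for illustration. What counts as "datasets that should contain the same variant" (for example several samples or several callers) is likewise left open here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a full VCF entry)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_variants(call_sets, population_af, impact,
                    max_af=0.01, keep_impacts=("HIGH", "MODERATE")):
    """Apply three filters in sequence; thresholds are illustrative assumptions.

    call_sets     -- list of iterables of Variant that should share the causal variant
    population_af -- dict Variant -> allele frequency in the general population
    impact        -- dict Variant -> predicted impact label (assumed labels)
    """
    # Step 1: keep only variants present in every call set.
    shared = set.intersection(*map(set, call_sets))

    # Step 2: drop variants that are common in the general population.
    # Variants absent from the frequency table are treated as rare (AF = 0).
    rare = {v for v in shared if population_af.get(v, 0.0) <= max_af}

    # Step 3: keep variants predicted to affect gene function.
    functional = (v for v in rare if impact.get(v) in keep_impacts)

    return sorted(functional, key=lambda v: (v.chrom, v.pos))
```

Each step only ever shrinks the candidate set, so the order of the three filters does not change the final result; it only affects how much data the later steps have to process.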
