Illustrations of Data Analysis Using the Mapper Algorithm and Persistent Homology

This is a Master's thesis from KTH/Mathematics (Dept.)

Author: Rami Kraft; [2016]


Abstract: The Mapper algorithm and persistent homology are topological data analysis tools for analyzing point cloud data. In addition, a classification method is used as part of the data analysis toolchain adopted in this thesis, serving as a technique for distinguishing between two class labels. This thesis has two major goals. The first is to present persistent homology and the Mapper algorithm as two techniques by which shapes, mostly point clouds sampled from shapes of known topology, can be identified and visualized even when noise is present. We provide illustrative examples in the form of barcodes, persistence diagrams, and topological network models for several point cloud datasets. The second goal is to propose an approach for extracting useful insights from point cloud data based on Mapper and a classification technique known as penalized logistic regression. We then examine two real-world datasets, considering both continuous and categorical responses. We show that it is very advantageous to apply a topological mapping tool such as the Mapper algorithm to a dataset as a pre-processing, organizing step before using a classification technique. Finally, we show that the Mapper algorithm not only allows for visualizing point cloud data but also for detecting possible flare-like shapes present in the shape of the data. The detected flares are given class labels, and the classification task is then to distinguish one from the other in order to discover relationships between variables that generalize to previously unseen data.
