Optimization of Heterogeneous Parallel Computing Systems using Machine Learning

Detta är en Kandidat-uppsats från Blekinge Tekniska Högskola/Institutionen för datavetenskap

Sammanfattning: Background: Heterogeneous parallel computing systems utilize the combination of different resources CPUs and GPUs to achieve high performance and, reduced latency and energy consumption. Programming applications that target various processing units requires employing different tools and programming models/languages. Furthermore, selecting the most optimal implementation, which may either target different processing units (i.e. CPU or GPU) or implement the various algorithms, is not trivial for a given context. In this thesis, we investigate the use of machine learning to address the selection problem of various implementation variants for an application running on a heterogeneous system. Objectives: This study is focused on providing an approach for optimization of heterogeneous parallel computing systems at runtime by building the most efficient machine learning model to predict the optimal implementation variant of an application. Methods: The six machine learning models KNN, XGBoost, DTC, Random Forest Classifier, LightGBM, and SVM are trained and tested using stratified k-fold on the dataset generated from the matrix multiplication application for square matrix input dimension ranging from 16x16 to 10992x10992. Results: The results of each machine learning algorithm’s finding are presented through accuracy, confusion matrix, classification report for parameters precision, recall, and F-1 score, and a comparison between the machine learning models in terms of accuracy, run-time training, and run-time prediction are provided to determine the best model. Conclusions: The XGBoost, DTC, SVM algorithms achieved 100% accuracy. In comparison to the other machine learning models, the DTC is found to be the most suitable due to its low time required for training and prediction in predicting the optimal implementation variant of the heterogeneous system application. Hence the DTC is the best suitable algorithm for the optimization of heterogeneous parallel computing.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)