Data augmentation and related opportunity cost for managing the contemporary data sparsity

This is a Bachelor's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Authors: Ali Muquri; Staffan Konstholm; [2021]

Abstract: This paper explored data augmentation as a remedy for the scarcity of labeled data, a deeply rooted problem in supervised machine learning projects. Convolutional neural network models with a ResNet-like architecture were trained on an augmented version of the CIFAR-10 dataset. Accuracy measurements from evaluating these models revealed which augmentation types are safe. The paper then took a direct-cost-only approach to quantify the benefit of these safe augmentations. An opportunity cost analysis of data labeling services, based on prices and tiers collected from different companies, showed that data augmentation can save thousands of US dollars (USD) in project costs. Overall, the study sheds much-needed light on the quantitative value of addressing data sparsity in machine learning projects and finds data augmentation both practical and financially sensible.
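
The direct-cost reasoning in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the choice of a horizontal flip as the augmentation, the function names, and the price per label are all hypothetical and chosen only for illustration. The only figure taken from outside the sketch is the size of the CIFAR-10 training set (50,000 images).

```python
def horizontal_flip(image):
    """Mirror an image given as a list of rows, each row a list of pixels.

    Assumption: a horizontal flip stands in for a "safe" augmentation here;
    the abstract does not say which augmentation types were found safe.
    """
    return [row[::-1] for row in image]


def avoided_labeling_cost_usd(n_augmented, price_per_label_cents):
    """Direct cost of buying labels for as many images as augmentation produced.

    The price is passed in whole cents so the arithmetic stays exact.
    """
    return n_augmented * price_per_label_cents / 100


# A tiny 2x3 "image" to show what the flip does.
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
flipped = horizontal_flip(tiny_image)  # [[3, 2, 1], [6, 5, 4]]

# CIFAR-10 has 50,000 training images; one flipped copy of each
# yields 50,000 additional labeled samples without new labeling work.
n_augmented = 50_000

# Hypothetical price of 5 cents per labeled image, NOT a figure from the thesis.
hypothetical_price_cents = 5

print(avoided_labeling_cost_usd(n_augmented, hypothetical_price_cents))  # 2500.0
```

Under these assumed numbers the avoided labeling cost is 2,500 USD, the same order of magnitude as the savings the abstract reports; real vendor prices and tiers would change the figure.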
