A study about Active Semi-Supervised Learning for Generative Models

Detta är en Master-uppsats från Linköpings universitet/Institutionen för datavetenskap

Sammanfattning: In many relevant scenarios, there is an imbalance between abundant unlabeled data and scarce labeled data to train predictive models. Semi-Supervised Learning and Active Learning are two distinct approaches to deal with this issue. The first one directly uses the unlabeled data to improve model parameter learning, while the second performs a smart choice of unlabeled points to be sent to an annotator, or oracle, which can label these points and increase the labeled training set. In this context, Generative Models are highly appropriate, since they internally represent the data generating process, naturally benefiting from data samples independently of the presence of labels. This Thesis proposes Expectation-Maximization with Density-Weighted Entropy, a novel active semi-supervised learning framework tailored towards generative models. The method is theoretically explored and experiments are conducted to evaluate its application to Gaussian Mixture Models and Multinomial Mixture Models. Based on its partial success, several questions are raised and discussed as to identify possible improvements and decide which shortcomings need to be dealt with before the method is considered robust and generally applicable.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)