  1. Exploit Unlabeled Data with Language Model for Text Classification. Comparison of four unsupervised learning models

    Master's thesis, Göteborgs universitet/Institutionen för filosofi, lingvistik och vetenskapsteori

    Author: Sung-Min Yang; [2018-10-29]
    Keywords: Text classification; Semi-supervised learning; Unsupervised learning; Transfer learning; Natural Language Processing;

    In a setting where Semi-Supervised Learning (SSL) can be used to exploit unlabeled data, this paper shows that a Language Model (LM) outperforms three other models in text classification. Those three models are based on Term Frequency-Inverse Document Frequency (Tf-idf) and two pre-trained word vectors. The experimental results show that the LM outperforms the other three unsupervised learning models whether the task is easy or difficult, where the difficult task involves imbalanced data.

  2. Investigating Machine Learning Clustering Methods to Replicate the Human Idea of Structure to Documents

    Master's thesis, Lunds universitet/Matematik LTH

    Authors: Johannes Jansson; Victor Miller; [2018]
    Keywords: machine learning; k-means; support vector machine; svm; tf-idf; clustering; document; documents; pdf; information retrieval; scikit-learn; Mathematics and Statistics;

    Anyone trying to maintain a set of text documents in an information retrieval system will run into problems keeping it relevant and up to date as the amount of data increases. This thesis investigates how a collection of documents can be clustered in a way that resembles how a human would organize it.

  3. Inflated Multinomial Matching for Anchor-Free Object Detection

    Master's thesis, Lunds universitet/Matematik LTH

    Author: Cesar Hiersemann; [2018]
    Keywords: Computer Vision; Machine Learning; Deep Learning; Convolutional Neural Networks; Object Recognition; Object Detection; Anchors; Anchor Boxes; Mathematics and Statistics;

    This thesis presents a novel matching strategy, Inflated Multinomial Matching, which enables training of anchor-free object detection models based on convolutional neural networks. An important aspect of detection models is the integral usage of anchor boxes, where an anchor box is a bounding box with a preset and constant location, size and shape in the image.

  4. Unsupervised Machine Learning: An Investigation of Clustering Algorithms on a Small Dataset

    Bachelor's thesis, Blekinge Tekniska Högskola/Institutionen för programvaruteknik

    Authors: Fredrik Forsberg; Pierre Alvarez Gonzalez; [2018]
    Keywords: (none listed)

    Context: With the rising popularity of machine learning, examining its shortcomings is valuable for seeing how widely machine learning is applicable. Is it possible to apply clustering to a small dataset? Objectives: This thesis consists of a literature study, a survey and an experiment.

  5. Forecasting High Yield Corporate Bond Industry Excess Return

    Master's thesis, KTH/Matematisk statistik

    Author: Carlos Junior Lopez Vydrin; [2018]
    Keywords: (none listed)

    In this thesis, we apply unsupervised and supervised statistical learning methods to the high-yield corporate bond market with the goal of predicting its future excess return. We analyse the excess return of industry-based indices of high-yield corporate bonds belonging to the Chemical, Metals, Paper, Building Materials, Packaging, Telecom, and Electric Utility industries.