Search: "Text classification"

Showing results 1 - 5 of 118 theses containing the words Text classification.

  1. Exploit Unlabeled Data with Language Model for Text Classification. Comparison of four unsupervised learning models

    Master's thesis, Göteborgs universitet/Institutionen för filosofi, lingvistik och vetenskapsteori

    Author: Sung-Min Yang; [2018-10-29]
    Keywords: Text classification; Semi-supervised learning; Unsupervised learning; Transfer learning; Natural Language Processing;

    Abstract: In a setting where Semi-Supervised Learning (SSL) can be used to exploit unlabeled data, this paper shows that a Language Model (LM) outperforms three other models in text classification: one based on Term Frequency-Inverse Document Frequency (Tf-idf) and two based on pre-trained word vectors. The experimental results show that the LM outperforms the other three unsupervised learning models on both the easy task and the difficult task, where the difficult task consists of imbalanced data.

  2. An Object-Oriented Data Analysis approach for text population

    Master's thesis, KTH/Matematisk statistik

    Author: Joffrey Dumont-Le Brazidc; [2018]
    Keywords:

    Abstract: As more and more digital text-valued data becomes available, the need arises to cluster, classify and study it. In this thesis, we develop statistical tools for null hypothesis testing and for clustering or classification of text-valued data in the framework of Object-Oriented Data Analysis.

  3. Analysis of Short Text Classification strategies using Out of-domain Vocabularies

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Diego Roa; [2018]
    Keywords:

    Abstract: Short text classification has become an important task for the Natural Language Processing (NLP) community due to the rapidly growing volume of tweets, search queries, short reviews and descriptions in contexts such as e-commerce, social media and internal Enterprise Resource Planning (ERP) systems. The brevity and sparsity of such text data make it challenging to build accurate classification models.

  4. Automatic Classification of text regarding Child Sexual Abusive Material

    Thesis for a professional degree at advanced level, Uppsala universitet/Avdelningen för systemteknik

    Author: Emil Fleron; [2018]
    Keywords: Deep Learning; Machine Learning; Artificial Neural Network; Djupinlärning; Maskininlärning; Artificiella neurala nätverk;

    Abstract: Sexual abuse is a horrible reality for many children around the world. As technology improves the availability of encryption schemes and anonymity over the internet, the perpetrators of these acts are increasingly hard to track.

  5. Text feature mining using pre-trained word embeddings

    Master's thesis, KTH/Matematisk statistik

    Author: Henrik Sjökvist; [2018]
    Keywords: Word embeddings; Feature engineering; Unsupervised learning; Deep learning; fast Text; Operational risk; Ordvektorer; Attributgenerering; Oövervakat lärande; Djupinlärning; fastText; Operativ risk;

    Abstract: This thesis explores a machine learning task where the data contains not only numerical features but also free-text features. To employ a supervised classifier and make predictions, the free-text features must be converted into numerical features. In this thesis, an algorithm is developed to perform that conversion.