  1. Text analysis for email multi label classification

    Master's thesis, University of Gothenburg/Department of Computer Science and Engineering

    Authors: Kyriaki Paniskaki; Sanjit Harsha Kadam; [2020-07-08]
    Keywords: natural language processing; machine learning; multi label text classification; deep neural networks; bilingual texts; emails; short texts;

    This master's thesis studies a multi-label text classification task on a small dataset of short bilingual (English and Swedish) texts (emails). Specifically, the dataset contains 5800 emails distributed among 107 classes, with the special case that the majority of the emails contain both languages at the same time.

  2. Multi-Label Text Classification with Transfer Learning for Policy Documents : The Case of the Sustainable Development Goals

    Master's thesis, Uppsala University/Department of Linguistics and Philology

    Author: Samuel Rodríguez Medina; [2019]
    Keywords: machine learning; deep neural networks; transfer learning; text classification; sustainable development goals; sdgs;

    We created and analyzed a text classification dataset from freely available web documents from the United Nations' Sustainable Development Goals. We then used it to train and compare different multi-label text classifiers, with the aim of exploring methods that facilitate searching for information in this type of document.

  3. Multilabel text classification of public procurements using deep learning intent detection

    Master's thesis, KTH/Mathematical Statistics

    Author: Adin Suta; [2019]
    Keywords: natural language processing; text classification; deep learning; applied mathematics; recurrent neural network; word embedding; machine learning; artificial neural networks;

    Textual data is one of the most widespread forms of data, and the amount of such data available in the world is increasing rapidly. Text can be understood as a sequence of either characters or words, where the latter approach is the most common.