Search: "lemmatization"

Showing results 1 - 5 of 11 theses containing the word lemmatization.

  1. IŻ SWÓJ JĘZYK MAJĄ! An exploration of the computational methods for identifying language variation in Polish

    Master's thesis, Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori

    Author: Maria Irena Szawerna; [2023-06-19]
    Keywords: language variation; Polish; diachronic linguistics; part-of-speech tagging; lemmatization; corpus linguistics;

    Abstract: Computational approaches to language variation continue to contribute in a relevant way to various fields, including Natural Language Processing (NLP) and linguistics. Being able to accommodate variation within natural language increases the robustness of NLP models and their usefulness in real-life applications; simultaneously, detecting and describing variation and the trends that govern it is one of the main goals of sociolinguistics and historical linguistics, meaning that some of the advances in NLP can contribute to these fields as well.

  2. Evaluating the robustness of DistilBERT to data shift in toxicity detection

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Caroline Larsen; [2022]
    Keywords: Machine learning; Natural Language Processing; DistilBERT; Toxicity Detection; Profanity Detection; Hate Speech Identification; Text preprocessing; Maskininlärning; naturligtspråkbehandling; DistilBERT; identifiering av kränkande språk; identifiering av svordomar; textbehandling;

    Abstract: With the rise of social media, cyberbullying and the online spread of hate have become serious problems with devastating consequences. Mentimeter is an interactive presentation tool enabling the presentation audience to participate by typing their own answers to questions asked by the presenter.

  3. Comparing and contrasting the dissemination cascades of different topics in a social network: What are the lifetimes of different topics and how do they spread

    Bachelor's thesis, Linköpings universitet/Institutionen för datavetenskap

    Authors: Linus Käll; Simon Pertoft; [2021]
    Keywords: spreading; dissemination; topics; topic; modeling; modelling; lda; fake; false; news; social; network;

    Abstract: The web has granted everyone the opportunity to freely share large amounts of data. Individuals, corporations, and communities have made the web an important tool in their arsenal. These entities are spreading information online, but not all of it is constructive.

  4. Implementation of an abstract module for entity resolution to combine data sources with the same domain information

    Master's thesis (one-year, magister), Luleå tekniska universitet/Institutionen för system- och rymdteknik

    Author: Ziaul Islam Chowdhury; [2021]

    Abstract: Increasing digitalization is creating a lot of data every day. Sometimes the same real-world entity is stored in multiple data sources but lacks a common reference. This creates a significant challenge for the integration of data sources and may cause duplicates and inconsistencies if not resolved correctly.

  5. Computational Analysis of Swedish Newspapers Using Topic Detection and Sentiment Analysis

    Bachelor's thesis, Uppsala universitet/Institutionen för informationsteknologi

    Author: Simon Wallbing; [2021]

    Abstract: Newspapers might report on the same event, say a sports event or a political statement, but since they most likely differ in the presentation, are the content and underlying message of the articles actually the same? A human can read two separate articles and determine if they touch on similar subjects and if they approach the subject in a positive or negative way. If this comparison were to be performed over several thousand articles, a computer would very much be the preferred method.