Search: "Levenshtein edit distance"

Showing results 1 - 5 of 6 theses containing the words Levenshtein edit distance.

  1. A Rule-Based Normalization System for Greek Noisy User-Generated Text

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Marsida Toska; [2020]
    Keywords: nlp; noisy text preprocessing; rule-based; levenshtein; twitter; normalization; Greek;

    Abstract: The ever-growing usage of social media platforms generates daily vast amounts of textual data which could potentially serve as a great source of information. Therefore, mining user-generated data for commercial, academic, or other purposes has already attracted the interest of the research community.

  2. Spell checker for a Java Application

    Bachelor's thesis, Karlstads universitet/Institutionen för matematik och datavetenskap (from 2013)

    Authors: Arvid Viktorsson; Illya Kyrychenko; [2020]
    Keywords: Spellchecker; Java; Trie; edit distance; Soundex; damerau; levenshtein;

    Abstract: Many text-editor users depend on spellcheckers to correct their typographical errors. The absence of a spellchecker can create a negative experience for the user. In today's advanced technological environment spellchecking is an expected feature.

  3. Spelling Normalization of English Student Writings

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Yuchan HONG; [2018]
    Keywords: spelling normalization; English student writings; phonetic similarity comparison; Levenshtein edit distance; character-based statistical machine translation; character-based neural machine translation;

    Abstract: Spelling normalization is the task to normalize non-standard words into standard words in texts, resulting in a decrease in out-of-vocabulary (OOV) words in texts for natural language processing (NLP) tasks such as information retrieval, machine translation, and opinion mining, improving the performance of various NLP applications on normalized texts. In this thesis, we explore different methods for spelling normalization of English student writings including traditional Levenshtein edit distance comparison, phonetic similarity comparison, character-based Statistical Machine Translation (SMT) and character-based Neural Machine Translation (NMT) methods.

  4. Weighting Edit Distance to Improve Spelling Correction in Music Entity Search

    Master's thesis, KTH/Skolan för datavetenskap och kommunikation (CSC)

    Author: Axel Samuelsson; [2017]
    Keywords: Spelling correction; edit distance; search; music; spotify; trie; Damerau; Levenshtein;

    Abstract: This master’s thesis project undertook investigation of whether the extant Damerau-Levenshtein edit distance measurement between two strings could be made more useful for detecting and adjusting misspellings in a search query. The idea was to use the knowledge that many users type their queries using the QWERTY keyboard layout, and weighting the edit distance in a manner that makes it cheaper to correct misspellings caused by confusion of nearer keys.

  5. Offline Approximate String Matching for Information Retrieval : An experiment on technical documentation

    Master's thesis, JTH. Forskningsmiljö Informationsteknik

    Author: Simon Dubois; [2013]
    Keywords: Algorithm comparison; Approximate string matching; Information retrieval; Offline string matching; Overlap coefficient; Phonetic indexation; String distance; String metric; String searching algorithm;

    Abstract: Approximate string matching consists in identifying strings as similar even if there is a number of mismatch between them. This technique is one of the solutions to reduce the exact matching strictness in data comparison. In many cases it is useful to identify stream variation (e.g. ...