Search: "Statistical Machine Translation"

Showing results 1 - 5 of 14 theses containing the words Statistical Machine Translation.

  1. Syntax-based Concept Alignment for Machine Translation

    Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik

    Author: Arianna Masciolini; [2023-03-30]
    Keywords: computational linguistics; machine translation; concept alignment; syntax; dependency parsing; Universal Dependencies; Grammatical Framework;

    Abstract: This thesis presents a syntax-based approach to Concept Alignment (CA), the task of finding semantic correspondences between parts of multilingual parallel texts, with a focus on Machine Translation (MT). Two variants of CA are taken into account: Concept Extraction (CE), whose aim is to identify new concepts by means of mere linguistic comparison, and Concept Propagation (CP), which consists in looking for the translation equivalents of a set of known concepts in a new language.

  2. Neural maskinöversättning av gawarbati [Neural machine translation of Gawarbati]

    Bachelor's thesis, Stockholms universitet/Avdelningen för datorlingvistik

    Author: Katarina Gillholm; [2023]
    Keywords: Machine translation; neural machine translation; NMT; low resource language; Gawarbati; transfer learning; GPT;

    Abstract: New neural models have led to major advances in machine translation, but they still perform worse on languages that lack large amounts of parallel data, so-called low-resource languages. Gawarbati is a small, endangered low-resource language for which only 5000 parallel sentences are available.

  3. Text and Speech Alignment Methods for Speech Translation Corpora Creation : Augmenting English LibriVox Recordings with Italian Textual Translations

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Giuseppe Della Corte; [2020]
    Keywords: speech translation; parallel corpora; bilingual sentence alignment; sentence embeddings; cosine similarity; forced alignment; text collection; corpora creation; audio signal processing;

    Abstract: The recent rise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large number of source-language speech utterances aligned with their target-language textual translations. We present a pipeline and a set of methods to collect hundreds of hours of English audio-book recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.

  4. Spelling Normalization of English Student Writings

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Yuchan HONG; [2018]
    Keywords: spelling normalization; English student writings; phonetic similarity comparison; Levenshtein edit distance; character-based statistical machine translation; character-based neural machine translation;

    Abstract: Spelling normalization is the task of converting non-standard words in a text into their standard forms. This reduces the number of out-of-vocabulary (OOV) words for natural language processing (NLP) tasks such as information retrieval, machine translation, and opinion mining, and so improves the performance of various NLP applications on the normalized text. In this thesis, we explore different methods for spelling normalization of English student writings, including traditional Levenshtein edit distance comparison, phonetic similarity comparison, character-based Statistical Machine Translation (SMT) and character-based Neural Machine Translation (NMT) methods.

  5. Attitydanalys av svenska produktomdömen – behövs språkspecifika verktyg? [Sentiment analysis of Swedish product reviews: are language-specific tools needed?]

    Bachelor's thesis, Stockholms universitet/Institutionen för lingvistik

    Author: Oliver Glant; [2018]
    Keywords: Computational linguistics; machine translation; neural network; product review; sentiment analysis;

    Abstract: Sentiment analysis of Swedish data is often performed using English tools and machine translation. This thesis compares a neural network trained on Swedish data with a corresponding one trained on English data.