Search: "Low-resource Languages"

Showing results 1 - 5 of 28 theses containing the words Low-resource Languages.

  1. A Comparative Analysis of Whisper and VoxRex on Swedish Speech Data

    Bachelor's thesis, Uppsala universitet/Statistiska institutionen

    Author: Max Fredriksson; Elise Ramsay Veljanovska; [2024]
    Keywords: ASR; Automatic Speech Recognition; Swedish Speech Recognition; Speech Recognition Models; Speech-to-Text; Whisper; VoxRex; Wav2Vec; Model Comparison; Transformer Models; Neural Networks; Machine Learning; WER; Word Error Rate; Transcription;

    Abstract: With the constant development of more advanced speech recognition models, the need to determine which models are better in specific areas and for specific purposes becomes increasingly crucial. This is even more true for low-resource languages such as Swedish, which depend on the progress of models for the large international languages. READ MORE

  2. Data Augmentation: Enhancing Named Entity Recognition Performance on Swedish Medical Texts

    Master's thesis, Göteborgs universitet/Institutionen för data- och informationsteknik

    Author: Lucas Rosvall; Niklas Paasonen; [2023-10-05]
    Keywords: Machine Learning; Information Extraction; Named Entity Recognition; BERT; Data Augmentation;

    Abstract: Named Entity Recognition (NER) refers to the task of locating relevant information within text sequences. Within the medical domain, it can benefit applications such as de-identifying patient records or extracting valuable data for other downstream tasks. READ MORE

  3. How negation influences word order in languages: Automatic classification of word order preference in positive and negative transitive clauses

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Chen Lyu; [2023]
    Keywords: (none)

    Abstract: In this work, we explore the possibility of using word alignment in a parallel corpus to project language annotations, such as Part-of-Speech tags and dependency relations, from high-resource languages to low-resource languages. We use a parallel corpus of Bible translations, comprising 1,444 translations in 986 languages, and a well-developed parser is used to annotate the source languages (English, French, German, and Czech). READ MORE

  4. Head-to-head Transfer Learning Comparisons Made Possible: A Comparative Study of Transfer Learning Methods for Neural Machine Translation of the Baltic Languages

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Mathias Stenlund; [2023]
    Keywords: machine translation; transfer learning; Latvian; Lithuanian; low-resource languages; transformers; parent language; child language; comparative study;

    Abstract: The difficulty of training adequate MT models with data-hungry NMT frameworks for low-resource language pairs has created a need to alleviate the scarcity of sufficiently large parallel corpora. Different transfer learning methods have been introduced as possible solutions to this problem, where a new model for a target task is initialized using parameters learned from some other high-resource task. READ MORE

  5. Neural Machine Translation of Gawarbati

    Bachelor's thesis, Stockholms universitet/Avdelningen för datorlingvistik

    Author: Katarina Gillholm; [2023]
    Keywords: Machine translation; neural machine translation; NMT; low-resource language; Gawarbati; transfer learning; GPT;

    Abstract: New neural models have led to major advances in machine translation, but they still perform worse on languages that lack large amounts of parallel data, so-called low-resource languages. Gawarbati is a small, endangered low-resource language for which only 5,000 parallel sentences are available. READ MORE