Search: "Wav2vec 2.0"

Showing results 1 - 5 of 6 theses containing the words Wav2vec 2.0.

  1. Domain Adaptation with N-gram Language Models for Swedish Automatic Speech Recognition: Using text data augmentation to create domain-specific n-gram models for a Swedish open-source wav2vec 2.0 model

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Viktor Enzell; [2022]
    Keywords: Automatic Speech Recognition; Domain Adaptation; Language Models; Ngram Models; Wav2vec2; Taligenkänning; Domänanpassning; Språkmodeller; N-gramModeller; Wav2vec2;

    Abstract: Automatic Speech Recognition (ASR) enables a wide variety of practical applications. However, many applications have their own domain-specific words, creating a gap between training and test data when used in practice.

  2. Cross-lingual and Multilingual Automatic Speech Recognition for Scandinavian Languages

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Rafal Černiavski; [2022]
    Keywords: cross-lingual; multilingual; automatic speech recognition; ASR;

    Abstract: Research into Automatic Speech Recognition (ASR), the task of transforming speech into text, remains highly relevant due to its countless applications in industry and academia. State-of-the-art ASR models can produce nearly perfect transcriptions, sometimes described as human-like; however, accurate ASR models are most often available only for high-resource languages.

  3. Automatisk taligenkänning som metod för att undersöka artikulationshastighet i svenska [Automatic speech recognition as a method for investigating articulation rate in Swedish]

    Bachelor's thesis, Stockholms universitet/Institutionen för lingvistik

    Author: Liv Martin Björkdahl; [2022]
    Keywords: ASR; automatic speech recognition; articulation rate; UID; dependency length; dependency minimization; information density; ASR; taligenkänning; artikulationshastighet; Wav2Vec 2.0; dependenslängd; korpusstudier; informationsdensitet; UID; dependenslängdsminimering;

    Abstract: Recent developments in automatic speech recognition have led to less resource-intensive and more efficient models. This opens up new possibilities for research on spontaneous speech. This study uses Kungliga Biblioteket's Swedish version of Wav2Vec 2. ...

  4. Multilingual Speech Emotion Recognition using pretrained models powered by Self-Supervised Learning

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Felix Luthman; [2022]
    Keywords: Speech; Audio; Emotion Recognition; Cross-lingual; Multilingual; Self-Supervised Learning; Wav2vec 2.0; HuBERT; UniSpeech; UniSpeech-SAT; WavLM; Språk; Ljud; Känsloigenkänning; Tvärspråklig; Flerspråkig; Själv-Övervakad Inlärning; Wav2vec 2.0; HuBERT; UniSpeech; UniSpeech-SAT; WavLM;

    Abstract: Society is based on communication, for which speech is the most prevalent medium. In day-to-day interactions we talk to each other, but it is not only the words spoken that matter, but the emotional delivery as well. Extracting emotion from speech has therefore become a topic of research in the area of speech tasks.

  5. Automatic Speech Recognition for low-resource languages using Wav2Vec2: Modern Standard Arabic (MSA) as an example of a low-resource language

    Master's thesis, Högskolan Dalarna/Institutionen för information och teknik

    Author: Taha Zouhair; [2021]
    Keywords: Automatic Speech Recognition; Facebook Wav2Vec; Mozilla Common Voice; Low-Resource Language;

    Abstract: The need for fully automatic translation at DigitalTolk, a Stockholm-based company providing translation services, leads to exploring Automatic Speech Recognition as a first step for Modern Standard Arabic (MSA). Facebook AI recently released a second version of its Wav2Vec models, dubbed Wav2Vec 2. ...