Search: "Speech segmentation"

Showing results 1 - 5 of 14 theses containing the words Speech segmentation.

  1. Analyzing the Influence of Synthetic and Augmented Data on Segmentation Model

    Thesis for a professional degree at advanced level, Luleå tekniska universitet/Institutionen för system- och rymdteknik

    Author: Alex Peschel; [2023]
    Keywords: Artificial Intelligence; Microorganisms; Segmentation; Synthesizing; Augmentation;

    Abstract: The field of Artificial Intelligence (AI) has experienced unprecedented growth in recent years, thanks to the numerous applications related to speech recognition, natural language processing, and computer vision. However, one of the challenges facing AI is the requirement for large amounts of energy, time, and data to be effective and accurate.

  2. Analysis of speaking time and content of the various debates of the presidential campaign : Automated AI analysis of speech time and content of presidential debates based on the audio using speaker detection and topic detection

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Axel Valentin Maza; [2023]
    Keywords: Artificial Intelligence; Speaker detection; Speaker recognition; Speaker diarization; Speaker identification; Debate; Politics; Deep Learning; Artificiell intelligens; talardetektion; talarigenkänning; talardiarisering; talaridentifiering; debatt; politik; djupinlärning;

    Abstract: The field of artificial intelligence (AI) has grown rapidly in recent years and its applications are becoming more widespread in various fields, including politics. In particular, presidential debates have become a crucial aspect of election campaigns and it is important to analyze the information exchanged in these debates in an objective way to let voters choose without being influenced by biased data.

  3. Swedish Language End-to-End Automatic Speech Recognition for Media Monitoring using Deep Learning

    Thesis for a professional degree at advanced level, Luleå tekniska universitet/Institutionen för system- och rymdteknik

    Author: Hector Nyblom; [2022]
    Keywords: Automatic Speech Recognition; Deep Learning; Machine Learning; Natural Language Processing; Media Monitoring;

    Abstract: In order to extract relevant information from speech recordings, the general approach is to first convert the audio into transcribed text. The text can then be analysed using well researched methods. NewsMachine AB provides customers with an overview of how they are represented in media by analysing articles in text form.

  4. Automatic Podcast Chapter Segmentation : A Framework for Implementing and Evaluating Chapter Boundary Models for Transcribed Audio Documents

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Adam Feldstein Jacobs; [2022]
    Keywords: Machine Learning; Natural Language Processing; Speech Technology; Deep Learning; Podcast Segmentation; Maskininlärning; Språkteknologi; Djupinlärning;

    Abstract: Podcasts are an exponentially growing audio medium where useful and relevant content should be served, which requires new methods of information sorting. This thesis is the first to look into the state-of-the-art problem of segmenting podcasts into chapters (structurally and topically coherent sections).

  5. Automatic Annotation of Speech: Exploring Boundaries within Forced Alignment for Swedish and Norwegian

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Klaudia Biczysko; [2022]
    Keywords: forced alignment; automatic speech recognition; ASR; natural language processing; under-resourced languages; Swedish; Norwegian; CTC segmentation; wav2vec2; kaldi; HTK; dynamic time warping;

    Abstract: In Automatic Speech Recognition, there is an extensive need for time-aligned data. Manual speech segmentation has been shown to be more laborious than manual transcription, especially when dealing with tens of hours of speech.