Showing results 1 - 5 of 100 theses matching the above search criteria.

  1. A Comparative Analysis of Whisper and VoxRex on Swedish Speech Data

    Bachelor's thesis, Uppsala universitet/Statistiska institutionen

    Authors: Max Fredriksson; Elise Ramsay Veljanovska; [2024]
    Keywords: ASR; Automatic Speech Recognition; Swedish Speech Recognition; Speech Recognition Models; Speech-to-Text; Whisper; VoxRex; Wav2Vec; Model Comparison; Transformer Models; Neural Networks; Machine Learning; WER; Word Error Rate; Transcription;

    Abstract: With the constant development of more advanced speech recognition models, the need to determine which models are better in specific areas and for specific purposes becomes increasingly crucial. This is even more true for low-resource languages such as Swedish, which depend on the progress of models for the large international languages.

  2. Analyzing the Influence of Synthetic and Augmented Data on Segmentation Model

    Professional degree thesis (advanced level), Luleå tekniska universitet/Institutionen för system- och rymdteknik

    Author: Alex Peschel; [2023]
    Keywords: Artificial Intelligence; Microorganisms; Segmentation; Synthesizing; Augmentation;

    Abstract: The field of Artificial Intelligence (AI) has experienced unprecedented growth in recent years, thanks to the numerous applications related to speech recognition, natural language processing, and computer vision. However, one of the challenges facing AI is the requirement for large amounts of energy, time, and data to be effective and accurate.

  3. aiLangu - Real-time Transcription and Translation to Reduce Language Barriers : An Engineering Project to Develop an Application for Enhancing Human Verbal Communication

    Bachelor's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Authors: Vincent Ringström; Iley Alvarez Funcke; [2023]
    Keywords: Sound transcription; Sound translation; AI; Deep learning; Real-time; Language barrier; Concurrency; Ljud transkription; Ljud översättning; AI; Djupinlärning; Real-tid; Språkbarriär; Samtidighet;

    Abstract: The research area this report relates to is real-time automatic transcription and translation. The purpose of the work is to reduce perceived language barriers online by building a user-friendly application that uses the latest deep learning technology to transcribe and translate in real time.

  4. Analysis of speaking time and content of the various debates of the presidential campaign : Automated AI analysis of speech time and content of presidential debates based on the audio using speaker detection and topic detection

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Axel Valentin Maza; [2023]
    Keywords: Artificial Intelligence; Speaker detection; Speaker recognition; Speaker diarization; Speaker identification; Debate; Politics; Deep Learning; Artificiell intelligens; talardetektion; talarigenkänning; talardiarisering; talaridentifiering; debatt; politik; djupinlärning;

    Abstract: The field of artificial intelligence (AI) has grown rapidly in recent years and its applications are becoming more widespread in various fields, including politics. In particular, presidential debates have become a crucial aspect of election campaigns, and it is important to analyze the information exchanged in these debates in an objective way to let voters choose without being influenced by biased data.

  5. Towards End-User Understanding: Exploring Explanations For Profanity Detection

    Master's thesis, Umeå universitet/Institutionen för datavetenskap

    Author: Noah Öberg; [2023]

    Abstract: Current text classification models can accurately identify instances of specific categories, such as hate speech or bad language, but they often don’t provide clear explanations to the end user for their decisions. This can lead to confusion or mistrust in the results, especially in sensitive applications where the consequences of misclassification can be significant.