Search: "automatisk taligenkänning" (automatic speech recognition)
Showing results 1 - 5 of 20 theses containing the words automatisk taligenkänning.
1. Identification and Classification of TTS Intelligibility Errors Using ASR : A Method for Automatic Evaluation of Speech Intelligibility
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: In recent years, applications using synthesized speech have become more numerous and publicly available. As the area grows, so does the need for delivering high-quality, intelligible speech, and subsequently the need for effective methods of assessing the intelligibility of synthesized speech.
2. Mispronunciation Detection with SpeechBlender Data Augmentation Pipeline
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: The rise of multilingualism has fueled the demand for computer-assisted pronunciation training (CAPT) systems for language learning. CAPT systems make use of speech technology advancements and offer features such as learner assessment and curriculum management. Mispronunciation detection (MD) is a crucial aspect of CAPT, aimed at identifying and correcting mispronunciations in second language learners’ speech.
3. Generation of Control Logic from Ordinary Speech
Bachelor's thesis, Högskolan i Halmstad/Akademin för informationsteknologi
Abstract: Developments in automatic code generation are evolving remarkably fast, with companies and researchers competing to reach human-level accuracy and capability. Advancements in this field primarily focus on using machine learning models for end-to-end code generation.
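4. En utvärdering av tjänster för taligenkänning och textsammanfattning och möjligheter att skapa undertexter i filmer. [An evaluation of services for speech recognition and text summarization and the possibilities for creating subtitles in films.]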
One-year master's thesis, KTH/Hälsoinformatik och logistik
Abstract: Creating subtitles for films is today a time-consuming craft. The company Firstlight Media subtitles about 200 films per week entirely by hand, and each film takes about 4–6 hours to complete. If parts of the subtitling process could be automated, there would be an opportunity to save resources.
5. Domain Adaptation with N-gram Language Models for Swedish Automatic Speech Recognition : Using text data augmentation to create domain-specific n-gram models for a Swedish open-source wav2vec 2.0 model
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Automatic Speech Recognition (ASR) enables a wide variety of practical applications. However, many applications have their own domain-specific words, creating a gap between training and test data when used in practice.