Achieving Automatic Speech Recognition for Swedish using the Kaldi toolkit

This is a Master's thesis from KTH/School of Computer Science and Communication (CSC)

Author: Zimon Mossberg; [2016]

Keywords: ASR; Kaldi; NST; GMM-HMM;

Abstract: The meager offering of online commercial Swedish Automatic Speech Recognition services prompts the effort to develop a speech recognizer for Swedish using the open source toolkit Kaldi and the publicly available NST speech corpus. Using a previous Kaldi recipe, several GMM-HMM models are trained and evaluated against commercial options, so that the performance of a customized solution for Automatic Speech Recognition can be compared with that of commercial services. The evaluation takes both accuracy and computational speed into consideration. Initial results of the evaluation indicate a systematic bias in the selected test set, confirmed by a follow-up investigative evaluation. The conclusion is that building a speech recognizer for Swedish using the NST corpus and Kaldi without expert knowledge is feasible but requires further work.
