Performance benchmarks of lip-sync scripting in Maya using speech recognition: Gender bias and speech recognition

This is a Bachelor's thesis from Blekinge Tekniska Högskola/Institutionen för datavetenskap

Author: Adrian Björkholm; [2022]

Keywords: Lip-syncing; speech recognition; Animation; Maya; Python;

Abstract: Background: Automated lip sync is used in animation to create facial animations with minimal intervention from an animator. A lip-syncing script for Maya has been written in Python, using the Vosk API to transcribe voice lines from audio files into instructions in Maya and so automate the pipeline for speech animations. Previous studies have reported that some voice transcription and voice recognition APIs show a gender bias, recognizing female voices less accurately than male voices. Does gender affect this lip-syncing script's performance in creating animations? Objectives: Benchmark the performance of a lip-syncing script that uses voice transcription by checking the voice transcription API for gender bias, comparing male and female voices as input. If there is a gender bias, how much does it affect the produced animations? Methods: The script's perceived performance was evaluated through a user study conducted as a questionnaire. The participants rated different animation attributes to build a picture of any perceived gender bias in the script. The transcribed voice lines were also analyzed for an objective view of a possible gender bias. Results: The transcriptions were almost perfect for both male and female voice lines, with just one transcription error, in one word of one of the male-voiced lines. In the questionnaire data, the male and female voice lines received very similar ratings. On average, the male voice lines seemed to receive higher ratings on most of the criteria, but the score difference was minimal. Conclusions: There is no gender bias in the lip-syncing script. In the accuracy experiment, the male and female voice lines had very similar accuracy rates; the female-voiced lines were slightly more accurate, by a margin of one word.
The male voice lines received slightly higher perceived scores in the questionnaire. This difference is attributed to factors other than a possible gender bias.
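
The objective part of the comparison described above, counting transcription errors per word for each group of voice lines, could be sketched roughly as follows. This is a minimal illustration, not the thesis's actual script: the function names `word_errors` and `word_accuracy` are hypothetical, the sample sentences are invented placeholders rather than the voice lines used in the study, and the word-level edit distance is an assumed metric that the abstract does not specify.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))  # distance from empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(pairs):
    """Share of reference words transcribed correctly over all (ref, hyp) pairs."""
    total_words = sum(len(ref.split()) for ref, _ in pairs)
    total_errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    return 1.0 - total_errors / total_words


# Invented placeholder data: (spoken script, transcription returned by the recognizer).
male_lines = [("the quick brown fox", "the quick brown box")]
female_lines = [("the quick brown fox", "the quick brown fox")]

print("male accuracy:  ", word_accuracy(male_lines))
print("female accuracy:", word_accuracy(female_lines))
```

Comparing the two accuracy values gives the kind of objective measure the abstract refers to, independent of how viewers rate the resulting animations.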
