Audio effect emulation with Neural Networks

This is a Bachelor's thesis from KTH/School of Computer Science and Communication (CSC)

Authors: Omar Del Tejo Catalá; Luis Masiá Fuster; [2017]


Abstract: This thesis investigates whether Neural Networks can be used to develop a model that emulates audio effects, and whether such a model can stand up to traditional audio effect emulators. The report compares the performance of Recurrent Neural Networks, namely Long Short-Term Memory and Gated Recurrent Unit networks, with that of Convolutional Neural Networks. It also examines whether the best-performing network, when processing an online stream of inputs, can produce its outputs without a significant delay, as traditional audio effect emulators do.

The networks were trained to emulate an EQ effect. The results compare the audio produced by the network with the target audio, that is, the input audio modified by the EQ. The comparison was made quantitatively, by calculating the absolute difference between the two signals and comparing their frequency spectra, and qualitatively, by checking whether listeners perceived the two signals as the same.

Long Short-Term Memory networks achieved the best results. However, they could not produce a stream of outputs without a significant delay, nor with an acceptable error.
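The quantitative comparison described above (absolute difference between the signals plus a frequency-spectrum comparison) could be sketched as follows. This is a minimal illustration using NumPy, assuming both signals are sampled arrays of equal rate; the function name and the choice of mean absolute error are illustrative, not the thesis's exact procedure.

```python
import numpy as np

def compare_audio(target, predicted):
    """Compare a network's output against the EQ-processed target audio.

    Returns the mean absolute difference in the time domain and the
    mean absolute difference between the magnitude spectra.
    (Illustrative metrics; the thesis does not specify exact formulas.)
    """
    n = min(len(target), len(predicted))
    target, predicted = np.asarray(target[:n]), np.asarray(predicted[:n])

    # Time-domain error: mean absolute difference between the waveforms
    time_err = np.mean(np.abs(target - predicted))

    # Frequency-domain error: compare magnitude spectra via the real FFT
    spec_target = np.abs(np.fft.rfft(target))
    spec_pred = np.abs(np.fft.rfft(predicted))
    spec_err = np.mean(np.abs(spec_target - spec_pred))

    return time_err, spec_err
```

Identical signals yield zero for both metrics, while a silent prediction against a non-silent target yields a positive error in both domains.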
