Reproducing the state of the art in onset detection using neural networks

This is a Bachelor's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Björn Lindqvist; [2019]


Abstract: Great strides have been made in state-of-the-art musical onset detection in recent years, with better and better detectors being invented at a fast pace. The current top spot is held by Schlüter and Böck, who in 2014 presented a detector based on a convolutional neural network (CNN) that attained an F-score of 90.3% (precision 91.7%, recall 88.9%) on a commonly used dataset [1]. In 2018 two researchers, Gong and Serra, tried to replicate their result but only reached an F-score of 86.67% (precision and recall values weren't reported) [2], a significantly worse result than Schlüter and Böck's. In comparison, a 2013 detector based on a recurrent neural network (RNN), also designed by Schlüter and Böck, achieved an F-score of 87.3% [3]. Gong and Serra's result casts doubt on the 90.3% figure reported by Schlüter and Böck. We therefore try to shed some light on what the state-of-the-art performance in musical onset detection is by posing and answering the question: can Schlüter and Böck's result be reproduced? Our answer is “Maybe – but we were unable to!” which is perhaps the only result possible, since you can't prove a negative. We trained the CNN architecture three times and obtained F-scores of 85.0%, 85.8% and 85.6%. For the RNN architecture, which we also tried to reproduce, we obtained 86.3% in each of three runs. Because the referenced articles omit details that may have been significant, we weren't able to recreate Schlüter and Böck's architectures exactly and had to make some “educated guesses.” It is possible that those guesses caused performance to suffer. Nevertheless, we believe that our work is worthwhile because it demonstrates how infuriatingly difficult it is in deep learning for researchers to reproduce each other's work.
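
As a quick consistency check on the figures quoted above, the reported precision and recall can be combined into an F-score. The sketch below assumes the F-score is the usual F1 measure, the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f_score` is purely illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted in the abstract for the 2014 CNN detector [1].
reported = f_score(0.917, 0.889)
print(f"F-score from reported precision/recall: {reported:.3f}")
```

Under that assumption the result rounds to the 90.3% quoted for Schlüter and Böck, so the three reported numbers are at least mutually consistent.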
