Generation of a metrical grid informed by Deep Learning-based beat estimation in jazz-ensemble recordings

This is a Master's thesis from KTH, School of Electrical Engineering and Computer Science (EECS)

Abstract: This work uses a Deep Learning architecture, specifically a state-of-the-art Temporal Convolutional Network, to track beat and downbeat positions in jazz-ensemble recordings and derive their metrical grid. This network architecture has been used successfully for general beat tracking. However, the jazz genre presents difficulties for this Music Information Retrieval sub-task due to its inherent complexity, and there is a lack of dedicated datasets for evaluating a model's beat-tracking performance across the different playstyles of this genre. We present a methodology in which we trained a PyTorch implementation of the original architecture with a recalculated binary cross-entropy loss that improves the model's performance over a conventionally trained version. In addition, we retrained these two models on source-separated drums and bass tracks from jazz recordings to improve performance further. We also improved performance by calibrating rhythm parameters with a priori knowledge that narrows the model's prediction range. Finally, we proposed a novel jazz dataset comprising recordings of the same jazz piece played in different styles and used it to evaluate this methodology. We also evaluated a novel sample with tempo variations to demonstrate the architecture's versatility. This methodology, or parts of it, can be exported to other research work and Music Information Retrieval tools that perform beat tracking or similar sub-tasks.
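The abstract does not specify how the binary cross-entropy loss was recalculated. One common motivation for modifying BCE in beat tracking is that beat frames are far rarer than non-beat frames, so the positive class is up-weighted. The sketch below illustrates that general idea in plain NumPy; the `pos_weight` parameter and the function itself are illustrative assumptions, not the thesis's actual loss.

```python
import numpy as np

def weighted_bce(pred, target, pos_weight):
    """Binary cross-entropy with a weight on positive (beat) frames.

    Beat-activation targets are sparse (few 1s among many 0s), so one
    common rebalancing strategy is to scale the positive term by
    `pos_weight`. This is a generic sketch, not the specific
    recalculated loss used in the thesis.
    """
    eps = 1e-7  # avoid log(0)
    pred = np.clip(pred, eps, 1.0 - eps)
    loss = -(pos_weight * target * np.log(pred)
             + (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()
```

With `pos_weight > 1`, errors on beat frames contribute more to the loss than errors on the far more numerous non-beat frames.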
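The abstract also mentions narrowing the model's prediction range with a priori rhythm knowledge. A simplified stand-in for that idea is to restrict tempo estimation from a beat-activation curve to a prior BPM interval; the function and its parameters below are hypothetical, since the thesis presumably constrains a TCN's post-processing stage (typically a dynamic Bayesian network) rather than a raw autocorrelation.

```python
import numpy as np

def dominant_tempo(activation, fps, min_bpm, max_bpm):
    """Estimate tempo from a beat-activation curve, searching only
    within a prior BPM range.

    activation: per-frame beat activations (1-D array)
    fps: activation frame rate (frames per second)
    min_bpm, max_bpm: a priori tempo bounds narrowing the search
    """
    # Convert the BPM bounds to autocorrelation lags (frames per beat).
    min_lag = int(round(fps * 60.0 / max_bpm))
    max_lag = int(round(fps * 60.0 / min_bpm))
    # Autocorrelation; keep only non-negative lags.
    ac = np.correlate(activation, activation, mode="full")
    ac = ac[len(activation) - 1:]
    # Pick the strongest lag inside the allowed range.
    lag = min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))
    return 60.0 * fps / lag
```

Restricting the search range this way prevents octave errors (e.g. reporting half or double the true tempo) when the plausible tempo of the piece is known in advance.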
