Classification of musical genres using hidden Markov models

This is a Master's thesis from Lund University / Mathematical Statistics

Abstract: Online music content is expanding fast, and music streaming services need algorithms that sort new music. Sorting music by its characteristics often comes down to considering its genre. Numerous studies have addressed automatic classification of audio files using spectral analysis and machine learning methods. However, many of these studies have limited relevance to real settings because they choose genres that are very dissimilar. The aim of this master’s thesis is to try a more realistic scenario, with genres whose boundaries are uncertain, such as Pop and R&B. Mel-frequency cepstral coefficients (MFCCs) were extracted from audio files and modelled as multidimensional Gaussian observations in a hidden Markov model (HMM) to classify the four genres Pop, Jazz, Classical and R&B. An alternative method was also tested, using a more theoretical approach based on music characteristics to improve classification. The maximum total accuracy obtained on an external test set was 0.742 for audio data and 0.540 for theoretical data, implying that a combination of the two methods will not result in an increase in accuracy. Different methods of evaluation and possible alternative approaches are discussed.
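
The classification step described in the abstract can be illustrated with a minimal sketch, assuming one HMM per genre and a decision rule that picks the genre whose model gives the highest log-likelihood. This is not the thesis implementation: the function names, the diagonal covariances, the two-state toy models and all parameter values are hypothetical, the short 2-dimensional vectors merely stand in for MFCC frames, and parameter training (e.g. Baum-Welch) is left out.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_likelihood(frames, hmm):
    """Forward algorithm in log space: log p(frames | hmm).

    Assumes all start and transition probabilities are strictly positive.
    """
    n = len(hmm["start"])

    def emit(j, x):
        return log_gauss_diag(x, hmm["means"][j], hmm["vars"][j])

    # Initialisation: start probability times emission of the first frame.
    alpha = [math.log(hmm["start"][j]) + emit(j, frames[0]) for j in range(n)]
    # Recursion: sum over previous states, then emit the current frame.
    for x in frames[1:]:
        alpha = [
            logsumexp([alpha[i] + math.log(hmm["trans"][i][j]) for i in range(n)])
            + emit(j, x)
            for j in range(n)
        ]
    # Termination: sum over the final states.
    return logsumexp(alpha)

def classify(frames, models):
    """Return the genre whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda genre: log_likelihood(frames, models[genre]))

# Hypothetical toy models; a real system would estimate these from training data.
models = {
    "Pop": {
        "start": [0.6, 0.4],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "means": [[0.0, 0.0], [1.0, 1.0]],
        "vars": [[1.0, 1.0], [1.0, 1.0]],
    },
    "Jazz": {
        "start": [0.5, 0.5],
        "trans": [[0.9, 0.1], [0.2, 0.8]],
        "means": [[5.0, 5.0], [6.0, 4.0]],
        "vars": [[1.0, 1.0], [1.0, 1.0]],
    },
}

# Stand-in for a short sequence of feature frames from one audio file.
frames = [[5.1, 4.9], [5.8, 4.2], [6.1, 3.9]]
print(classify(frames, models))
```

Working in log space avoids numerical underflow, which would otherwise occur quickly for the long frame sequences produced by real audio files.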
