Finding time-based listening habits in users music listening history to lower entropy in data

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: In a world where information, entertainment and e-commerce are growing rapidly in volume and options, it can be challenging for individuals to find what they want. Search engines and recommendation systems have emerged as solutions that guide users. A typical example is Spotify, a music streaming company that uses users' listening data and other derived metrics to provide personalised music recommendations. Spotify hypothesises that external factors affect users' listening preferences, and that some of these factors, such as workout routines and commuting to work, affect users routinely. This work aims to find time-based listening habits in users' music listening history in order to decrease the entropy in the data, resulting in a better understanding of the users. While this work primarily targets listening habits, the method can, in theory, be applied to any time-series dataset. Listening histories were split into hour vectors: vectors in which each element represents the share of a label/genre played during an hour. The hour vectors represent the data well regardless of listening volume. They also allow for clustering, which makes it possible to find hours in which similar music was played. Hour slots that routinely appeared in the same cluster became a profile, highlighting a habit. In the final implementation, a user is represented by a profile vector that allows a different profile for each hour of the week. Several users were profiled with the proposed approach and evaluated by the decrease in Shannon entropy when profiled compared to when not profiled. On average, user entropy dropped by 9%, with the largest decreases in the 50% range and a small portion of users seeing no decrease. In addition, the profiling was evaluated by measuring cosine similarity across users' listening histories, which showed a correlation between gain in cosine similarity and decrease in entropy.
In conclusion, users become more predictable and interpretable when profiled. This knowledge can be used to understand users better, or as a feature for recommender systems and other analyses.
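
The hour-vector and entropy ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the toy listening data, and the hand-assigned `profile_of_hour` mapping are all assumptions made for illustration, and the clustering step that would normally produce the profiles is skipped. The profiled entropy is computed here as a play-weighted average of per-profile entropies, which is one plausible reading of "entropy when profiled".

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(counts):
    """Shannon entropy in bits of a label distribution given raw play counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)

def hour_vectors(plays):
    """Map each hour slot to a normalised genre distribution,
    so the representation does not depend on listening volume."""
    per_hour = defaultdict(Counter)
    for hour, genre in plays:
        per_hour[hour][genre] += 1
    return {h: {g: c / sum(cnt.values()) for g, c in cnt.items()}
            for h, cnt in per_hour.items()}

def profiled_entropy(plays, profile_of_hour):
    """Play-weighted average of the entropy inside each profile."""
    per_profile = defaultdict(Counter)
    for hour, genre in plays:
        per_profile[profile_of_hour[hour]][genre] += 1
    total = len(plays)
    return sum(sum(cnt.values()) / total * shannon_entropy(cnt)
               for cnt in per_profile.values())

# Hypothetical toy history: (hour slot, genre) pairs.
plays = [(8, "podcast")] * 4 + [(18, "rock")] * 4
# Assumed output of a clustering step, assigned by hand here.
profile_of_hour = {8: "commute", 18: "workout"}

baseline = shannon_entropy(Counter(genre for _, genre in plays))
profiled = profiled_entropy(plays, profile_of_hour)
vectors = hour_vectors(plays)
```

In this toy case the unprofiled history mixes two genres evenly, while each profile contains a single genre, so the profiled entropy is lower than the baseline. Real listening histories would show smaller and more varied decreases.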
