Overcoming The New Item Problem In Recommender Systems : A Method For Predicting User Preferences Of New Items

This is a Master's thesis from Stockholm University/Department of Statistics

Abstract: This thesis addresses the new item problem in recommender systems: the challenge of providing personalized recommendations for items that have limited user interaction history. The study proposes and evaluates a method for generating personalized recommendations for movies, shows, and series on one of Sweden’s largest streaming platforms. By treating these items as documents of the attributes that characterize them and utilizing item similarity through the k-nearest neighbor algorithm, user preferences for new items are predicted based on users’ past preferences for similar items. Two models for feature representation, namely the Vector Space Model (VSM) and a Latent Dirichlet Allocation (LDA) topic model, are considered and compared. The k-nearest neighbor algorithm is used to identify similar items for each type of representation, with cosine distance for VSM and Kullback-Leibler divergence for LDA. Furthermore, three different ways of predicting user preferences based on the preferences for the neighbors are presented and compared. The performance of the models in predicting preferences for new items is evaluated with historical streaming data. The results indicate the potential of leveraging item similarity and previous streaming history to predict preferences for new items. The VSM representation proved more successful; using this representation, 77 percent of actual positive instances were correctly classified as positive (a recall of 77 percent). For both types of representation, giving higher weight to preferences for more similar items when predicting preferences yielded higher F2 scores, and optimizing for the F2 score implied that recommendations should be made when there is the slightest indication of preference for the neighboring items.
The results indicate that the neighbors identified through the VSM representation were more representative of user preferences for new items, compared to those identified through the LDA representation.
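
The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea it describes: represent items as attribute vectors (VSM), find the k nearest items by cosine similarity, predict a user's preference for a new item as a similarity-weighted average of that user's preferences for the neighbors, and evaluate with the F2 score. The function names, the toy vectors, the weighting formula, and the zero threshold are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two attribute vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def predict_preference(new_item, known_items, user_prefs, k=2, threshold=0.0):
    """Similarity-weighted k-NN prediction (hypothetical weighting scheme).

    Returns (score, recommend), where score is the weighted average of the
    user's 0/1 preferences for the k most similar known items and recommend
    is True when score exceeds the threshold.
    """
    sims = np.array([cosine_similarity(new_item, v) for v in known_items])
    nearest = np.argsort(-sims)[:k]          # indices of the k most similar items
    weights = sims[nearest]
    if weights.sum() == 0:                   # no similar items at all
        return 0.0, False
    score = float(weights @ user_prefs[nearest] / weights.sum())
    return score, score > threshold

def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy data (made up): rows are items, columns are binary attributes.
known_items = np.array([[1, 1, 0, 0],
                        [1, 0, 1, 0],
                        [0, 0, 1, 1]], dtype=float)
user_prefs = np.array([1.0, 0.0, 1.0])       # the user's past preferences per item
new_item = np.array([1, 1, 1, 0], dtype=float)

score, recommend = predict_preference(new_item, known_items, user_prefs, k=2)
```

With a threshold of zero, any positive weighted preference among the neighbors triggers a recommendation, which mirrors the abstract's observation that optimizing for F2 favors recommending at the slightest indication of preference. For the LDA representation, the cosine similarity would be swapped for a divergence between topic distributions such as Kullback-Leibler, which this sketch does not cover.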
