Shoppin’ in the Rain: An Evaluation of the Usefulness of Weather-Based Features for an ML Ranking Model in the Setting of Children’s Clothing Online Retailing

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Online shopping offers numerous benefits, but large product catalogs make it difficult for shoppers to be aware of every item for sale and its characteristics. To simplify the decision-making process, online retailers use ranking models to recommend products relevant to each individual user. Contextual user data, such as location, time, or local weather conditions, can serve as valuable features for ranking models, enabling personalized real-time recommendations. Little research has been published on the usefulness of weather-based features for ranking models in online clothing retailing, which motivates further study of the topic. Using Swedish sales and customer data from Babyshop, an online retailer of children’s fashion, this study examined possible correlations between local weather data and sales. This was done by comparing differences in daily weather with differences in the daily share of items sold per clothing category for two cities: Stockholm and Göteborg. Malmö was then added as a third city. Historical observational weather data from one location in each of Stockholm, Göteborg, and Malmö was featurized and used along with the customers’ postal towns, sales features, and sales trend features to train a LightGBM learning-to-rank model based on gradient boosted decision trees and to evaluate its ranking relevancy. The ranking relevancy was compared against a LightGBM baseline that omitted the weather features and against a naive baseline, a popularity-based ranker. Several possible correlations were found between clothing categories (such as shorts, rainwear, shell jackets, and winter wear) and weather variables (such as feels-like temperature, solar energy, wind speed, precipitation, snow, and snow depth).
Ranking relevancy was evaluated using the mean reciprocal rank (MRR) and the mean average precision at 10 (MAP@10), both on a small dataset consisting only of customer data from the postal towns Stockholm, Göteborg, and Malmö and on a larger dataset in which customers in postal towns from larger geographical areas had their home locations approximated as Stockholm, Göteborg, or Malmö. The LightGBM rankers beat the naive baseline in three out of four configurations, and the ranker with weather features outperformed the LightGBM baseline by 1.1 to 2.2 percent across all configurations. The findings can potentially help online clothing retailers create more relevant product recommendations.
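The two evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the thesis's own evaluation code. The function names, the toy item identifiers, and the choice to normalize average precision by min(k, number of relevant items) are assumptions made for illustration; other normalizations of average precision at k are also in common use, and the abstract does not say which one the thesis applied.

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1 / (position of the first relevant item), or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision_at_k(
    ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 10
) -> float:
    """Average of precision@i over the positions i <= k that hold a relevant item.

    Assumption: the sum is normalized by min(k, number of relevant items),
    and the ranked list contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / min(k, len(relevant))


def mean_reciprocal_rank(rankings, relevants) -> float:
    """Mean of the reciprocal rank over all queries (for example, customer sessions)."""
    scores = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


def mean_average_precision_at_k(rankings, relevants, k: int = 10) -> float:
    """Mean of average precision at k over all queries."""
    scores = [average_precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


# Hypothetical toy data: two ranked product lists and the items actually bought.
toy_rankings = [["shorts", "rainwear", "cap"], ["parka", "mittens", "boots"]]
toy_relevants = [{"rainwear"}, {"parka", "boots"}]

mrr = mean_reciprocal_rank(toy_rankings, toy_relevants)
map10 = mean_average_precision_at_k(toy_rankings, toy_relevants, k=10)
```

In this toy example the first relevant items sit at positions 2 and 1, so the mean reciprocal rank is (1/2 + 1) / 2 = 0.75. Both metrics reward rankers that place purchased items near the top of the list, which is why they suit comparisons between a ranker with weather features and one without.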
