Flight Sorting Algorithm Based on Users’ Behaviour

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: The model predicts the best order in which to present flights and recommends the best flight to users. The thesis is divided into three parts: feature selection, data preprocessing, and experiments with various algorithms. For feature selection, in addition to the original information about the flight itself, we add the user's selection status to the model, namely the flight class and whether the user is travelling with children. In the data preprocessing stage, data cleaning is used to handle incomplete and duplicated data, and a normalization method is then applied to remove noise from the data. Of the balancing methods tried, SMOTE corrects the class imbalance best. Based on the existing data, we choose classification models and a sequential ranking algorithm, using price, whether the flight is direct, travel time, etc. as features and whether the flight was clicked as the label. The classification algorithms used are Logistic Regression, Gradient Boosting, KNN, Decision Tree, Random Forest, Gaussian Process Classifier, Gaussian Naive Bayes, and Quadratic Discriminant Analysis. The results show that Random Forest with SMOTE performs best, with a ROC AUC of 0.94 and an accuracy of 0.8998.
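
The abstract names SMOTE as the balancing step without describing it. The following is a minimal sketch of the core idea only (a synthetic minority sample is placed on the segment between a minority sample and one of its nearest minority neighbours), written with the Python standard library. It is not the thesis's implementation: the function `smote`, its parameters, the feature tuples, and the toy values are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical, already-normalised features: (price, travel_time, is_direct).
clicked = [(0.20, 0.30, 1.0), (0.25, 0.35, 1.0),
           (0.30, 0.20, 1.0), (0.15, 0.40, 0.0)]
not_clicked = [(0.80, 0.90, 0.0)] * 12

# Oversample the minority class (clicked) up to the size of the majority class.
new_clicked = smote(clicked, n_new=len(not_clicked) - len(clicked))
balanced_clicked = clicked + new_clicked
print(len(balanced_clicked), len(not_clicked))
```

Note that plain interpolation can produce fractional values for a binary feature such as the hypothetical `is_direct`; variants such as SMOTE-NC exist for mixed continuous and categorical data. Whether the thesis handled this is not stated in the abstract.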
