Modeling Success Factors for Start-ups in Western Europe through a Statistical Learning Approach

This is a Master's thesis from KTH/Industriell ekonomi och organisation (Dept. of Industrial Economics and Management)

Abstract: The purpose of this thesis was to use a quantitative method to extend previous research on start-up success prediction. This was done by including more criteria in the study, made possible by the Crunchbase database, the largest available information source for start-ups. The data was restricted to Western European start-ups in order to study how limiting the data to one geographical region affects the prediction models, which to our knowledge has not been done before in this type of research. The quantitative method was machine learning, using three predictors: Logistic Regression, Random Forest, and K-Nearest Neighbor (KNN). All three models proposed and evaluated achieve better prediction accuracy than guessing the outcome at random. When tested on data previously unseen by the model, Random Forest performed best, classifying a successful company as a success and a failed company as a failure with 79 percent accuracy. Logistic Regression and KNN followed with accuracies of 65 percent and 59 percent, respectively.
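The accuracy figures above are the share of companies whose predicted outcome matches the actual one. A minimal sketch of that calculation follows, using only the Python standard library. The labels are hypothetical and are not data from the thesis, and the majority-class baseline is an assumption used here as a simple reference point; it is not the random-guess comparison the thesis reports.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most common training label for each of n test cases."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


# Hypothetical labels: 1 = success, 0 = failure (illustrative only).
y_train = [1, 0, 0, 1, 0, 0]
y_test = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_model = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]  # output of some hypothetical model

model_acc = accuracy(y_test, y_model)
baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
print(f"model: {model_acc:.2f}, baseline: {baseline_acc:.2f}")
```

A model is only informative if its held-out accuracy clearly exceeds such a baseline, which is the kind of comparison the abstract draws for its three models.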
