Pattern Recognition applied to Continuous integration system.

Detta är en Master-uppsats från Blekinge Tekniska Högskola/Institutionen för datalogi och datorsystemteknik

Sammanfattning: Context: Thisthesis focuses on regression testing in the continuous integration environment which is integration testing that ensures that changes made in the new development code to thesoftware product do not introduce new faults to the software product. Continuous integration is software development practice which integrates all development, testing, and deployment activities. In continuous integration,regression testing is done by manually selecting and prioritizingtestcases from a larger set of testcases. The main challenge faced using manual testcases selection and prioritization is insome caseswhereneeded testcases are ignored in subset of selected testcasesbecause testers didn’t includethem manually while designing hourly cycle regression test suite for particular feature development in product. So, Ericsson, the company in which environment this thesis is conducted,aims at improvingtheirtestcase selection and prioritization in regression testing using pattern recognition. Objectives:This thesis study suggests prediction models using pattern recognition algorithms for predicting future testcases failures using historical data. This helpsto improve the present quality of continuous integration environment by selecting appropriate subset of testcases from larger set of testcases for regression testing. There exist several candidate pattern recognition algorithms that are promising for predicting testcase failures. Based on the characteristics of the data collected at Ericsson, suitable pattern recognition algorithms are selected and predictive models are built. Finally, two predictive models are evaluated and the best performing model is integrated into the continuous integration system. Methods:Experiment research method is chosen for this research because discovery of cause and effect relationships between dependent and independent variables can be used for the evaluation of the predictive model.The experiment is conducted in RStudio, which facilitates to train the predictive models using continuous integration historical data. The predictive ability of the algorithms is evaluated using prediction accuracy evaluation metrics. Results: After implementing two predictive models (neural networks & k-nearest means) using continuous integration data, neural networks achieved aprediction accuracy of 75.3%, k-nearest neighbor gave result 67.75%. Conclusions: This research investigated the feasibility of an adaptive and self-learning test machinery by pattern recognition in continuous integration environment to improve testcase selection and prioritization in regression testing. Neural networks have proved effective capability of predicting failure testcase by 75.3% over the k-nearest neighbors.Predictive model can only make continuous integration efficient only if it has 100% prediction capability, the prediction capability of the 75.3% will not make continuous integration system more efficient than present static testcase selection and prioritization as it has deficiency of lacking prediction 25%. So, this research can only conclude that neural networks at present has 75.3% prediction capability but in future when data availability is more,this may reach to 100% predictive capability. The present Ericsson continuous integration system needs to improve its data storage for historical data at present it can only store 30 days of historical data. The predictive models require large data to give good prediction. To support continuous integration at present Ericsson is using jenkins automation server, there are other automation servers like Team city, Travis CI, Go CD, Circle CI which can store data more than 30 days using them will mitigate the problem of data storage.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)