Transformer decoder as a method to predict diagnostic trouble codes in heavy commercial vehicles

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Diagnostic trouble codes (DTCs) have traditionally been used by mechanics to determine what is wrong with a vehicle. A vehicle generates a DTC when a specific condition in the vehicle is met. This condition is defined by an engineer and represents a fault that has occurred. The intuition is therefore that DTCs contain useful information about the health of the vehicle. Because DTCs are sequentially ordered and have a high number of unique values, this data modality has characteristics that resemble those of natural language. This thesis investigates whether an algorithm that has shown promise in the field of Natural Language Processing can be applied to sequences of DTCs. More specifically, the deep learning model called the transformer decoder is compared to a baseline n-gram model in terms of how well each estimates a probability distribution over the next DTC conditioned on previously seen DTCs. Such an estimate could be useful to manufacturers of heavy commercial vehicles, such as Scania, when creating systems that help ensure high vehicle uptime. The algorithms were compared by first performing a hyperparameter search for each and then comparing the resulting models using the 5x2 cross-validation paired t-test. Three metrics were evaluated: perplexity, Top-1 accuracy, and Top-5 accuracy. It was concluded that there was a significant difference in performance between the two models, with the transformer decoder performing better on the metrics used in the evaluation. The transformer decoder had a perplexity of 22.1, a Top-1 accuracy of 37.5%, and a Top-5 accuracy of 59.1%. In contrast, the n-gram model had a perplexity of 37.6, a Top-1 accuracy of 7.5%, and a Top-5 accuracy of 30%.
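To make the three metrics concrete, the following is a minimal Python sketch of how perplexity, Top-1 accuracy, and Top-k accuracy could be computed for a next-DTC model. It is not the thesis implementation: the bigram order, the add-alpha smoothing, the function names `train_bigram` and `evaluate`, and the example DTC strings are all assumptions made for illustration. Perplexity is taken here as the exponential of the average negative log-probability assigned to the true next code, and every code in the evaluation data is assumed to appear in `vocab`.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sequences, vocab, alpha=1.0):
    """Count bigrams and return a function giving P(next | prev).

    Add-alpha smoothing is an assumption of this sketch, chosen so that
    unseen transitions still receive non-zero probability.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    def predict(prev):
        total = sum(counts[prev].values()) + alpha * len(vocab)
        return {code: (counts[prev][code] + alpha) / total for code in vocab}

    return predict


def evaluate(predict, sequences, k=5):
    """Return perplexity, Top-1 accuracy, and Top-k accuracy over all transitions."""
    log_prob_sum, top1_hits, topk_hits, n = 0.0, 0, 0, 0
    for seq in sequences:
        for prev, target in zip(seq, seq[1:]):
            dist = predict(prev)
            # Rank codes from most to least probable.
            ranked = sorted(dist, key=dist.get, reverse=True)
            log_prob_sum += math.log(dist[target])
            top1_hits += ranked[0] == target
            topk_hits += target in ranked[:k]
            n += 1
    return {
        "perplexity": math.exp(-log_prob_sum / n),
        "top1": top1_hits / n,
        "topk": topk_hits / n,
    }


if __name__ == "__main__":
    # Made-up DTC strings, used only to exercise the functions.
    vocab = ["DTC_A", "DTC_B", "DTC_C", "DTC_D"]
    train = [["DTC_A", "DTC_B", "DTC_C"], ["DTC_A", "DTC_B", "DTC_D"]]
    held_out = [["DTC_A", "DTC_B", "DTC_C"]]
    model = train_bigram(train, vocab)
    print(evaluate(model, held_out, k=2))
```

A useful sanity check on the perplexity figure: a model that assigns equal probability to every code has a perplexity equal to the vocabulary size, so lower values indicate that the model concentrates probability on the codes that actually occur next.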

