Performance Evaluation of Imitation Learning Algorithms with Human Experts

Detta är en Kandidat-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: The purpose of this thesis was to compare the performance of three different imitation learning algorithms with human experts, with limited expert time. The central question was, ”How should one implement imitation learning in a simulated car racing environment, using human experts, to achieve the best performance when access to the experts is limited?”. We limited the work to only consider the three algorithms Behavior Cloning, DAGGER, and HG-DAGGER and limited the implementation to the car racing simulator TORCS. The agents consisted of the same type of feedforward neural network that utilized sensor data provided by TORCS. Through comparison in the performance of the different algorithms on a different amount of expert time, we can conclude that HGDAGGER performed the best. In this case, performance is regarded as a distance covered given set time. Its performance also seemed to scale well with more expert time, which the others did not. This result confirmed previously published results when comparing these algorithms.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)