Evaluation of communication protocol performance for use in reinforcement learning training in simulation

This is a Master's thesis from Umeå University / Department of Computing Science

Abstract: As artificial intelligence (AI) grows more prominent, the methods used to train it deserve attention. One such method is reinforcement learning in simulation, where an AI can train safely within the confines of a simulation. This requires a simulation environment, which in turn must communicate with a reinforcement learning system. It is therefore of interest how this communication affects the performance of the system, which is the question this study examines.

Several communication protocols are evaluated in a test program using the kinds of data found in reinforcement learning systems: floating-point numbers and images. The protocols are sockets, Socket.IO, gRPC, and ZeroMQ. Sockets and ZeroMQ are shown to perform similarly when sending floats, while ZeroMQ performs better when sending images. For larger amounts of data, however, sockets perform better. ZeroMQ is considered the best choice for an application dealing with floats and images, owing to its performance and its more extensive built-in ease-of-use functionality compared to sockets.

ZeroMQ is adapted into a working example of reinforcement learning training in simulation, using Unreal Engine as the simulation environment, AGX Dynamics for physics simulation, and Stable Baselines3 for reinforcement learning. Performance in the simulation is similar to, but slower than, that in the test program. In the small example used, the reinforcement learning process is the slowest part of the system; the simulation is the next slowest, at a third of the reinforcement learning time; and communication back and forth accounts for half of the simulation time. As the system grows more complex, reinforcement learning time and simulation time are expected to grow much faster than communication time. Therefore, if optimization is to be done, it is likely better to focus on the other parts first.
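The thesis's test program is not reproduced here. As a rough illustration of the kind of measurement the abstract describes, the sketch below times round trips of a small float payload over a plain TCP socket on the loopback interface, using only the Python standard library. The function names (`recv_exact`, `echo_server`, `measure_round_trip`), the echo-style exchange, the 32-bit float encoding, and the number of rounds are all assumptions made for illustration, not details taken from the thesis.

```python
import socket
import struct
import threading
import time


def recv_exact(conn, n):
    """Read exactly n bytes from a socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(server_sock, payload_size, rounds):
    """Accept one client and echo back `rounds` fixed-size messages."""
    conn, _ = server_sock.accept()
    with conn:
        for _ in range(rounds):
            conn.sendall(recv_exact(conn, payload_size))


def measure_round_trip(values, rounds=100):
    """Return (mean round-trip time in seconds, last echoed values).

    Hypothetical helper: sends `values` as 32-bit little-endian floats to a
    local echo server and waits for the echo, `rounds` times.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    payload = struct.pack(f"<{len(values)}f", *values)

    # Port 0 asks the operating system for any free port.
    server_sock = socket.create_server(("127.0.0.1", 0))
    port = server_sock.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(server_sock, len(payload), rounds)
    )
    server.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        # Disable Nagle's algorithm so small messages are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            client.sendall(payload)
            echoed = recv_exact(client, len(payload))
        elapsed = time.perf_counter() - start

    server.join()
    server_sock.close()
    return elapsed / rounds, list(struct.unpack(f"<{len(values)}f", echoed))
```

Calling `measure_round_trip([0.5, -1.25, 3.0], rounds=1000)` returns the mean round-trip time together with the echoed values. The same loop could be pointed at an image-sized byte buffer, or at another transport, to compare protocols in the way the abstract outlines; loopback timings on one machine say nothing about the figures reported in the thesis.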
