Learning Operational Goals for Propulsion System Using Reinforcement Learning
Abstract: This degree project, conducted at ABB, analyzes and solves different situations that a crew on board a vessel might face when controlling its propulsion system. The propulsion system is modelled as static and deterministic in its transitions, but stochastic in its measured data. This model is used to formulate a decision problem as a finite Markov Decision Process, which is then tackled using Q-learning, Speedy Q-learning and Double Q-learning for three different objectives that are relevant to the system's behaviour and performance. The policies found in the experiments work as intended for each objective, and the experiments indicate that more training improves performance, as expected given that there is a proof of convergence for Q-learning based algorithms. The convergence rates of the three algorithms are then compared against a solution regarded as optimal, to see how fast they converge and to estimate the time needed to solve problems similar to the ones stated in this thesis.
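As a rough illustration of the kind of tabular method named in the abstract, the following is a minimal sketch of standard Q-learning on a small deterministic finite MDP. It is not the thesis's propulsion model or experimental setup: the five-state chain, the reward, the uniformly random behaviour policy and all names and parameter values (`N_STATES`, `step`, `q_learning`, `alpha`, `GAMMA`) are assumptions chosen only for illustration. Speedy Q-learning and Double Q-learning are not shown.

```python
import random

# Hypothetical toy problem, NOT the propulsion system from the thesis:
# a chain of states 0..4 where state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = step left, 1 = step right
GAMMA = 0.9        # discount factor (assumed value)


def step(state, action):
    """Deterministic transition; reward 1.0 only when the last state is reached."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy.

    Q-learning is off-policy, so the greedy policy can be learned while
    the agent explores at random.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Bootstrapped target: no future value from a terminal state.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

In this toy chain the learned greedy policy should step right in every non-terminal state, since that is the shortest path to the only reward. Comparing the learned table against the known optimal values of such a small problem is one simple way to observe convergence over training episodes.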