Integrating Reinforcement Learning into Behavior Trees by Hierarchical Composition

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Mart Kartasev; [2019]


Abstract: This thesis investigates ways to extend the use of Reinforcement Learning (RL) to Behavior Trees (BTs). BTs are used in the field of Artificial Intelligence (AI) to create modular and reactive planning agents. While human-designed BTs can react to changes in an environment foreseen by an expert, they cannot adapt to new scenarios. The focus of the thesis is on using existing general-purpose RL methods within the framework of BTs. Deep Q-Networks (DQN) and Proximal Policy Optimisation (PPO) were embedded into BTs, using RL implementations from an open-source RL library. The experimental part of the thesis uses these nodes in a variety of scenarios of increasing complexity, demonstrating some of the benefits of combining RL and BTs. The experiments show that there are benefits to using BTs to control a set of hierarchically decomposed RL sub-tasks when solving a larger problem. Such decomposition allows generic behaviors to be reused in different parts of a BT. By decomposing the RL problem using a BT, it is also possible to identify and replace problematic parts of a policy, rather than retraining the entire policy.
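To make the composition concrete, the sketch below shows one way a trained RL policy could be wrapped as a leaf node and sequenced with other sub-task nodes in a BT. This is a minimal illustration, not the thesis implementation: the Status enum, the RLPolicyNode and Sequence classes, and the env.observe()/env.step() interface are all names assumed for this example.

    # Minimal sketch (assumed names, not the thesis code): an RL policy as a
    # Behavior Tree leaf node, composed with a standard Sequence node.
    from enum import Enum
    from typing import Callable, List

    class Status(Enum):
        SUCCESS = 1
        FAILURE = 2
        RUNNING = 3

    class RLPolicyNode:
        """Leaf node that delegates action selection to a trained RL policy."""

        def __init__(self, policy: Callable, env, goal_reached: Callable):
            self.policy = policy              # maps an observation to an action
            self.env = env                    # environment handle shared by the tree
            self.goal_reached = goal_reached  # sub-task termination predicate

        def tick(self) -> Status:
            obs = self.env.observe()
            if self.goal_reached(obs):
                return Status.SUCCESS
            self.env.step(self.policy(obs))   # execute one policy action per tick
            return Status.RUNNING

    class Sequence:
        """Composite node: ticks children in order, stops at the first
        child that does not succeed."""

        def __init__(self, children: List):
            self.children = children

        def tick(self) -> Status:
            for child in self.children:
                status = child.tick()
                if status != Status.SUCCESS:
                    return status
            return Status.SUCCESS

A larger task could then be assembled as, e.g., Sequence([RLPolicyNode(reach_policy, env, near_target), RLPolicyNode(grasp_policy, env, holding_target)]) (hypothetical policy and predicate names). If one sub-policy underperforms, only that node's policy needs retraining or replacement, which mirrors the modularity benefit described in the abstract.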
