Investigation of Different Observation and Action Spaces for Reinforcement Learning on Reaching Tasks

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Ching-an Wu; [2019]


Abstract: Deep reinforcement learning has been shown to be a potential alternative to traditional controllers for robotic manipulation tasks. Most modern deep reinforcement learning methods used for robotic control fall into the so-called model-free paradigm. While model-free methods require less space and generalize better than model-based methods, they suffer from higher sample complexity, which leads to the problem of sample inefficiency. In this thesis, we analyze three modern model-free deep reinforcement learning methods: deep Q-network (DQN), deep deterministic policy gradient (DDPG), and proximal policy optimization (PPO), under different representations of the state-action space, to gain better insight into the relation between sample complexity and sample efficiency. The experiments are conducted on two robotic reaching tasks. The experimental results show that the complexity of the observation and action spaces is strongly related to sample efficiency during training. This conclusion is in line with corresponding theoretical work in the field.
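To make "different representations of the state-action space" concrete, below is a minimal sketch of how observation and action spaces of varying complexity might be declared for a reaching task. It assumes the gymnasium `spaces` API; the specific layouts (a 2-DoF arm, the particular observation contents) are hypothetical illustrations, not taken from the thesis.

```python
# Hypothetical observation/action space choices for a 2-DoF reaching task,
# declared with the gymnasium `spaces` API. These layouts are illustrative
# only; they show how space complexity can vary for the same task.
import numpy as np
from gymnasium import spaces

# Compact observation: two joint angles plus a 2-D target position.
obs_low_dim = spaces.Box(low=-np.pi, high=np.pi, shape=(4,), dtype=np.float32)

# Richer observation: joint angles, joint velocities, end-effector position,
# and target position. A higher-dimensional space for the agent to learn
# from, which the thesis relates to sample efficiency during training.
obs_high_dim = spaces.Box(low=-np.inf, high=np.inf, shape=(8,), dtype=np.float32)

# Discrete actions (e.g. increment/decrement each joint angle): the form
# DQN requires, since it maximizes over a finite set of actions.
act_discrete = spaces.Discrete(4)

# Continuous actions (e.g. joint torques or velocity targets): the form
# DDPG and PPO handle directly.
act_continuous = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

print(obs_low_dim, obs_high_dim, act_discrete, act_continuous, sep="\n")
```

The split between `act_discrete` and `act_continuous` also reflects why all three algorithms are worth comparing: DQN is restricted to discrete action spaces, while DDPG and PPO operate on continuous ones, so the choice of action representation partly determines which method applies at all.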
