Deep Reinforcement Learning for Dynamic Grasping

This is a thesis for a professional degree at advanced level from Uppsala universitet/Avdelningen för systemteknik

Abstract: Dynamic grasping is the act of manipulating the position of a moving object in space using only contact force. Doing so with a robot is a complex task, but one with wide-ranging applications. The automation of processes throughout society is progressing rapidly, and for many of these processes the grasping of objects has a natural place. This work explored deep reinforcement learning techniques for dynamic grasping in a simulated environment. Deep Deterministic Policy Gradient was chosen and evaluated for the task, both by itself and in combination with the Hindsight Experience Replay buffer. The reinforcement learning agent observed the initial state of the target object and the robot in the environment, simulated using AGX Dynamics, and then determined which position to move to and at what speed. The agent's chosen action was relayed to ABB's virtual controller, which controlled the robot in the simulation. The agent was thus tasked with parametrizing, in advance, a predefined set of robot instructions in such a way that the moving target object would be grasped and picked up. Doing it in this manner, as opposed to having the agent continuously control the robot, was a necessary challenge, since it made it possible to utilize the intelligence already built into the virtual controller. It also makes it easier to transfer what an agent has learned in a simulated environment to a real-world environment. The accuracy of the target policy for the simpler agent was 99.07%, while that of the agent with the more advanced replay buffer reached 99.30%. These results show promise for the future, both because we expect further fine-tuning to raise them further and because they indicate that deep reinforcement learning methods can be highly applicable to the robotics systems of today.
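
The combination of a single up-front action with Hindsight Experience Replay can be illustrated with a minimal sketch. It is not the thesis implementation: the field names, the representation of the goal as a position, the tolerance, and the 0/-1 sparse reward are all assumptions made for illustration. The idea shown is only the relabeling step, where a failed attempt is stored a second time with the outcome it actually achieved treated as the goal.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One-step episode (hypothetical layout, assumed for illustration)."""
    observation: Vec    # initial state of target object and robot
    action: Vec         # chosen once per episode, e.g. (x, y, z, speed)
    desired_goal: Vec   # assumed: position the grasp should have reached
    achieved_goal: Vec  # assumed: position that was actually reached
    reward: float


def sparse_reward(achieved: Vec, desired: Vec, tol: float = 0.01) -> float:
    """Return 0.0 within tolerance of the goal, otherwise -1.0 (assumed convention)."""
    dist = sum((a - d) ** 2 for a, d in zip(achieved, desired)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def hindsight_relabel(t: Transition) -> Transition:
    """Pretend the achieved outcome was the goal all along and recompute the reward."""
    return replace(
        t,
        desired_goal=t.achieved_goal,
        reward=sparse_reward(t.achieved_goal, t.achieved_goal),
    )


def store_episode(buffer: List[Transition], t: Transition) -> None:
    """Store the original transition together with its hindsight copy."""
    buffer.append(t)
    buffer.append(hindsight_relabel(t))
```

Because each episode here consists of a single action, the only outcome available for relabeling is the final one, so every stored episode yields exactly one additional, successful transition.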
