Safe Reinforcement Learning for Social Human-Robot Interaction: Shielding for Appropriate Backchanneling Behavior

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Achieving appropriate and natural backchanneling behavior in social robots remains a challenge in Human-Robot Interaction (HRI). This thesis addresses this issue by applying methods from Safe Reinforcement Learning, in particular shielding, to improve social robot backchanneling behavior. The aim of the study is to develop and implement a safety shield that guarantees appropriate backchanneling. To achieve this, a Recurrent Neural Network (RNN) is trained on a human-human conversational dataset. Two agents are built: one uses a random algorithm to backchannel, and the other uses shields on top of its algorithm. The two agents are tested on recorded human audio and later evaluated in a between-subjects user study with 41 participants. The results did not show a statistically significant difference between the two conditions at the chosen significance level of α = 0.05. However, we observe that the agent with shields showed better listening behavior and more appropriate backchanneling behavior, and missed fewer backchanneling opportunities, than the agent without shields. This could indicate that shields have a positive impact on the robot’s behavior. We discuss potential explanations for the lack of statistical significance and highlight directions for further exploration.
