A Multi-camera based Next Best View Approach for Semantic Scene Understanding

This is a thesis for a professional degree at advanced level from Högskolan i Halmstad (Halmstad University), Akademin för informationsteknologi (School of Information Technology)

Abstract: Robots are becoming more common; robotics has gone from bleeding-edge technology to an everyday topic that families discuss around the dinner table. The number of robots in industry is growing, and with it the need for robots to understand the environment they are working in.

The standard method for a robot to gather information about a scene involves moving to different pre-determined poses from which it can view and analyze the scene. However, this approach does not consider the topology of the scene that the robot should explore. This thesis aims to create a two-dimensional approach to determining the next best view (2D-NBV) from which to view and explore the scene, introduced in the method section.

The 2D-NBV method converts a point cloud of the scene to an elevation map. A segmentation network is used to get the positions of pre-trained objects. The positions are then used to generate a 2D Gaussian kernel heatmap of the scene. The NBV pose is calculated from the 2D elevation map and the Gaussian heatmap, and is then converted back to a 6D pose. The robot moves to this pose to capture a new point cloud and register it to the scene.

The 2D-NBV method is compared to a baseline method and a state-of-the-art method. The baseline method captures four point clouds from pre-determined positions and registers them together. The state-of-the-art method finds a point of interest and declares a set of view candidates on a sphere around that point. Ray casting is used to find the pose with the highest information gain, and this pose is set as the NBV for the robot to move to. The goal of this thesis is for the 2D-NBV method to perform better than the baseline method, described further in the method section.

The evaluation metric used in this thesis is how well the different methods could estimate the bounding boxes of pre-trained items using an off-the-shelf semantic scene segmentation method. Six scenes of varying difficulty were constructed to test the methods.

The results showed that the 2D-NBV method successfully complemented the scene with information about its empty cells. The 2D-NBV method outperformed the state-of-the-art method on occluded scenes and performed, overall, just as well as the baseline. The reason that the 2D-NBV method did not outperform the baseline is seen as a consequence of the information lost in going from 3D to 2D.
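
The two 2D maps named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the grid size, the cell size, the kernel width `sigma`, and the choice to keep the maximum height per cell are all assumptions made for illustration. The abstract does not say how the NBV pose is derived from the two maps, so the sketch stops once both maps are built.

```python
import math


def elevation_map(points, cell_size, width, height):
    """Project 3D points (x, y, z) onto a width x height grid.

    Assumption: each cell keeps the maximum z of the points that fall in it.
    Cells that receive no point stay None, i.e. unobserved.
    """
    grid = [[None] * width for _ in range(height)]
    for x, y, z in points:
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= row < height and 0 <= col < width:
            current = grid[row][col]
            if current is None or z > current:
                grid[row][col] = z
    return grid


def gaussian_heatmap(centers, sigma, width, height):
    """Sum an isotropic 2D Gaussian kernel centred on each (row, col) position.

    Assumption: `centers` are object positions already expressed in grid cells,
    for example as produced by a segmentation network.
    """
    heat = [[0.0] * width for _ in range(height)]
    for r0, c0 in centers:
        for r in range(height):
            for c in range(width):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                heat[r][c] += math.exp(-d2 / (2.0 * sigma ** 2))
    return heat


# Hypothetical toy data: three points in metres, one detected object.
points = [(0.05, 0.05, 0.10), (0.07, 0.02, 0.30), (0.25, 0.15, 0.20)]
elev = elevation_map(points, cell_size=0.1, width=3, height=2)
heat = gaussian_heatmap([(1, 2)], sigma=1.0, width=3, height=2)
```

In this sketch, cells left as `None` in the elevation map are the unobserved ones; these may correspond to the "empty cells" that the results refer to, though the abstract does not define that term.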
