Assessing the Efficiency of COLMAP, DROID-SLAM, and NeRF-SLAM in 3D Road Scene Reconstruction

This is a Master's thesis from Lund University/Mathematics LTH

Abstract: 3D reconstruction is a field in computer vision that has evolved rapidly as a result of recent advances in deep learning. Because 3D reconstruction pipelines can now run in real time, new possibilities have opened up for teams developing Advanced Driver Assistance Systems (ADAS), which rely on the vehicle's camera system to enhance safety and the driving experience. This thesis presents a comparative analysis of two state-of-the-art visual SLAM pipelines, DROID-SLAM and NeRF-SLAM, and the classical Structure-from-Motion system COLMAP. The objective was to use the multi-camera system on a Volvo vehicle, together with public datasets, to accurately estimate trajectories and generate annotatable 3D road scenes. To assess the performance of the three methods, an evaluation pipeline was developed. The results showed that COLMAP and DROID-SLAM can estimate trajectories with high accuracy when using the Volvo vehicle's multi-camera system. Both systems were also found to be capable of creating annotatable 3D road scenes, with some differences in quality and runtime efficiency. In general, COLMAP produced high-quality results, but its long runtimes make it impractical to use at scale. The least promising method for Volvo Cars' use case was NeRF-SLAM, which failed to produce acceptable reconstructions using the multi-camera system. In conclusion, DROID-SLAM showed the most potential for Volvo Cars' use case of the three methods evaluated in this thesis. Despite being used predominantly off the shelf, it produced impressive results with low runtimes. Nevertheless, additional research and fine-tuning are needed to optimize its performance for Volvo Cars' setup.
