Data Augmentations for Improving Vision-Based Damage Detection : in Land Transport Infrastructure

Detta är en Master-uppsats från KTH/Lantmäteri – fastighetsvetenskap och geodesi

Sammanfattning: Crack, a typical term most people know, is a common form of distress or damage in road pavements and railway sleepers. It poses significant challenges to their structural integrity, safety, and longevity. Over the years, researchers have developed various data-driven technologies for image-based crack detection in road and sleeper applications. The image-based crack detection has become a promising field.  Many researchers use ensemble learning to win the Road Damage Detection Challenge. The challenge provides a street view dataset from several countries from different perspectives. The version of the dataset is 2020, which contains images from Japan, India, and Czech. Thus, the dataset inherits a domain shift problem. Current solutions use ensemble learning to deal with such a problem. Those solutions require much computational power and challenge adaptability in real-time applications. To mitigate the problem, the thesis experiments with various data augmentation techniques that could improve the base model performance. The main focuses are erasing a crack from an image using generative AI (Erase), implementing road segmentation by using the Panoptic Segmentation (RS) and injecting a perspective-aware synthetic crack (InjectPa) into the segmented road surface in the image. The results show that compared to the base model, the Erase + RS techniques improve the model's F1 score when trained only on Japan in the dataset rather than when trained on three countries simultaneously. Moreover, the InjectPa technique does not help improve the base model in both scenarios. Then, the experiment moved to the SBB dataset containing close-up images of sleepers from cameras mounted in front of the diagnostic vehicle. This section follows the same techniques but changes the segmentation model to the Segment Anything Model (SAM) because the previous segmentation model was trained on a street view dataset, making it vulnerable to close-up images. The Erase + SAM techniques show improvement in bbox/AP and validation loss. Nevertheless, it does not improve the F1 score significantly compared to the base model.  This thesis also applies the explainable AI name D-RISE to determine which feature most influences the model decision. D-RISE shows that the augmentation model can pay attention to the damage type pothole for road pavements and defect type spalling for sleepers than other types. Finally, the thesis discusses the results and suggests a strategy for future study. 

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)