Multi-Modal Visual Tracking Using Infrared Imagery

Detta är en Master-uppsats från Linköpings universitet/Datorseende

Sammanfattning: Generic visual object tracking is the task of tracking one or several objects in all frames in a video, knowing only the location and size of the target in the initial frame. Visual tracking can be carried out in both the infrared and the visual spectrum simultaneously, this is known as multi-modal tracking. Utilizing both spectra can result in a more diverse tracker since visual tracking in infrared imagery makes it possible to detect objects even in poor visibility or in complete darkness. However, infrared imagery lacks the number of details that are present in visual images. A common method for visual tracking is to use discriminative correlation filters (DCF). These correlation filters are then used to detect an object in every frame of an image sequence. This thesis focuses on investigating aspects of a DCF based tracker, operating in the two different modalities, infrared and visual imagery. First, it was investigated whether the tracking benefits from using two channels instead of one and what happens to the tracking result if one of those channels is degraded by an external cause. It was also investigated if the addition of image features can further improve the tracking. The result shows that the tracking improves when using two channels instead of only using a single channel. It also shows that utilizing two channels is a good way to create a robust tracker, which is still able to perform even though one of the channels is degraded. Using deep features, extracted from a pre-trained convolutional neural network, was the image feature improving the tracking the most, although the implementation of the deep features made the tracking significantly slower. 

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)