Reducing the computational complexity of a CNN-based neural network used for partitioning in VVC compliant encoders

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Block partitioning is a computationally heavy step in the video coding process. Previously, this stage has been performed with a full-search-like algorithm. Recently, Artificial Neural Network (ANN) approaches to speeding up block partitioning in encoders compliant with the Versatile Video Coding (VVC) standard have been shown to significantly decrease the time needed for block partitioning. In this degree project, a state-of-the-art Convolutional Neural Network (CNN) was ported to VTM16. It was ablated into 7 new models, which were trained and tested. The effects of the ablations were compared and discussed with respect to the number of Multiply-Accumulate operations (MACs) a model required, the speed-up in the encoding stage, and the quality of the encoding. The results show that the number of MACs can be substantially decreased from that of the state-of-the-art model with only small negative effects on the quality of the encoding. Furthermore, the results show that both tested approaches to reducing the computational complexity of the model were effective: 1) reducing the image’s resolution earlier in the model, and 2) reducing the number of features in the early layers. The results point towards the first approach being more effective.
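
To make the MAC metric concrete, the following is a minimal sketch of how MACs can be counted for a stack of 2D convolutions, using the standard count of output height × output width × input channels × output channels × kernel area. The layer sizes, the 64x64 input, and the function names `conv_macs` and `model_macs` are illustrative assumptions and do not describe the architecture or the measurements in the thesis.

```python
def conv_macs(height, width, c_in, c_out, kernel=3, stride=1):
    """MACs for one 2D convolution with 'same' padding (bias ignored)."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return out_h * out_w * c_in * c_out * kernel * kernel


def model_macs(height, width, layers):
    """Sum MACs over a list of (c_in, c_out, stride) convolution layers."""
    total = 0
    for c_in, c_out, stride in layers:
        total += conv_macs(height, width, c_in, c_out, stride=stride)
        # The next layer sees the downsampled feature map.
        height = -(-height // stride)
        width = -(-width // stride)
    return total


# Hypothetical 3-layer stacks on a 64x64 single-channel block.
baseline = [(1, 16, 1), (16, 16, 1), (16, 32, 2)]
# Approach 1: move the stride-2 downsampling to the first layer.
early_downsample = [(1, 16, 2), (16, 16, 1), (16, 32, 1)]
# Approach 2: halve the number of features produced by the first layer.
fewer_features = [(1, 8, 1), (8, 16, 1), (16, 32, 2)]

for name, layers in [("baseline", baseline),
                     ("early_downsample", early_downsample),
                     ("fewer_features", fewer_features)]:
    print(name, model_macs(64, 64, layers))
```

In this toy setting, downsampling earlier shrinks the spatial term for every later layer, while cutting features only shrinks the channel terms of the layers that produce or consume them. The numbers printed depend entirely on the assumed layer sizes and say nothing about the encoding quality trade-off studied in the thesis.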
