Introducing Machine Learning in a Vectorized Digital Signal Processor

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: Machine learning is rapidly being integrated into all areas of society, however, that puts a lot of pressure on resource costraint hardware such as embedded systems. The company Ericsson is gradually integrating machine learning based on neural networks, so-called deep learning, into their radio products. One promising product is their vectorized Digital Signal Processor (DSP) that are based upon the machine learning suitable Single Instruction, Multiple Data (SIMD) paradigm and Very Long Instruction Word (VLIW) architecture. However, despite the suitability of the SIMD paradigm, the embedded system needs to efficiently execute a computation-intensive deep learning algorithm with proper use of its limited resources. Therefore commonly used methods of implementing each layer of the computation-intensive Convolutional Neural Network (CNN), a type of Deep Neural Network (DNN), have been used and evaluated its implementation on the hardware and to assess the vectorized DSP’s deep learning suitability and capabilities. Despite the suitability of the hardware, the implementation utilized less than half of the available resources at all times during the execution. The main limitations were identified to be the limited 16-bit element instructions. To enhance the performance and improve the utilization of the available resources, easy-to-implement hardware instructions have been suggested. This work has made the first steps of implementing an efficiently performing CNN implementation on the examined vectorized DSP.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)