Convolutional Neural Networks on FPGA and GPU on the Edge: A Comparison
Abstract: When implementing a neural network application, the choice of hardware platform is not always straightforward. This thesis studies several relevant platforms with regard to performance, power efficiency and usability, with the purpose of providing a basis for such decisions. The hardware platforms studied are a GPU, an FPGA and a CPU. The project implements Convolutional Neural Networks (CNNs) on the different hardware platforms using several tools and frameworks. The final implementation uses BNN-PYNQ on the FPGA and CPU, which provides ready-to-run code and overlays for quantized CNNs and fully connected neural networks. These networks are then replicated in TensorFlow and optimized to FP32, FP16 and INT8 precision using TensorRT for use on the GPU. The results indicate that, in terms of inference speed, the FPGA outperforms the GPU by a factor of 100 for the CNNs and by a factor of 1000 for the fully connected networks. The FPGA also outperforms the GPU in power efficiency. The thesis concludes that for a neural network application, an FPGA is preferred if performance is a priority. However, the GPU proved easier to use owing to the many tools and frameworks available. If easy implementation and high design flexibility are priorities, a GPU is recommended instead.