A Fault-Aware Resource Manager for Multi-Processor System-on-Chip

Detta är en Master-uppsats från Institutionen för datavetenskap; Tekniska högskolan

Författare: Bentolhoda Ghaeini; [2013]

Nyckelord: Fault tolerance; Resource manager; Threshold;

Sammanfattning: The semiconductor technology development empowers fabrication of extremelycomplex integrated circuits (ICs) that may contain billions of transistors. Suchhigh integration density enables designing an entire system onto a single chip,commonly referred to as a System-on-Chip (SoC). In order to boost performance,it is increasingly common to design SoCs that contain a number of processors, socalled multi-processor system-on-chips (MPSoCs).While on one hand, recent semiconductor technologies enable fabrication ofdevices such as MPSoCs which provide high performance, on the other hand thereis a drawback that these devices are becoming increasingly susceptible to faults.These faults may occur due to escapes from manufacturing test, aging effects orenvironmental impacts. When present in a system, faults may disrupt functionalityand can cause incorrect system operation. Therefore, it is very importantwhen designing systems to consider methods to tolerate potential faults. To copewith faults, there is a need of fault handling which implies automatic detection,identification and recovery from faults which may occur during the system’s operation.This work is about the design and implementation of a fault handling methodsfor an MPSoC. A fault aware Resource Manager (RM) is designed and implementedto obtain correct system operation and maximize the system’s throughputin the presence of faults. The RM has the responsibility of scheduling jobs to availableresources, collecting fault states from resources in the system and performingfault handling tasks, based on fault states. The RM is also employed in multipleexperiments in order to study its behavior in different situations.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)