Evaluation of the Configurable Architecture REPLICA with Emulated Shared Memory
REPLICA is a family of novel scalable chip multiprocessors with configurable emulated shared memory architecture, whose computation model is based on the PRAM (Parallel Random Access Machine) model.
The purpose of this thesis is to, by benchmarking different types of computation problems on REPLICA, similar parallel architectures (SB-PRAM and XMT) and more diverse ones (Xeon X5660 and Tesla M2050), evaluate how REPLICA is positioned among other existing architectures, both in performance and programming effort. But it should also examine if REPLICA is more suited for any special kinds of computational problems.
By using some of the well known Berkeley dwarfs, and input from unbiased sources, such as The University of Florida Sparse Matrix Collection and Rodinia benchmark suite, we have made sure that the benchmarks measure relevant computation problems.
We show that today’s parallel architectures have some performance issues for applications with irregular memory access patterns, which the REPLICA architecture can solve. For example, REPLICA only need to be clocked with a few MHz to match both Xeon X5660 and Tesla M2050 for the irregular memory access benchmark breadth first search. By comparing the efficiency of REPLICA to a CPU (Xeon X5660), we show that it is easier to program REPLICA efficiently than today’s multiprocessors.
HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)