Using Map-Reduce for Large Scale Analysis of Graph-Based Data

Detta är en Master-uppsats från KTH/Skolan för informations- och kommunikationsteknik (ICT)

Författare: Nan Gong; [2011]

Nyckelord: ;

Sammanfattning: As social networks have gained in popularity, maintaining and processing the social network graph information using graph algorithms has become an essential source for discovering potential features of the graph. The escalating size of the social networks has made it impossible to process the huge graphs on a single ma chine in a “real-time” level of execution. This thesis is looking into representing and distributing graph-based algorithms using Map-Reduce model. Graph-based algorithms are discussed in the beginning. Then, several distributed graph computing infrastructures are reviewed, followed by Map-Reduce introduction and some graph computation toolkits based on Map-Reduce model. By reviewing the background and related work, graph-based algorithms are categorized, and adaptation of graph-based algorithms to Map-Reduce model is discussed. Two particular algorithms, MCL and DBSCAN are chosen to be designed using Map- Reduce model, and implemented using Hadoop. New matrix multiplication method is proposed while designing MCL. The DBSCAN is reformulated into connectivity problem using filter method, and Kingdom Expansion Game is proposed to do fast expansion. Scalability and performance of these new designs are evaluated. Conclusion is made according to the literature study, practical design experience and evaluation data. Some suggestions of graph-based algorithms design using Map-Reduce model are also given in the end.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)