Self-adaptive and hierarchical membership management in distributed system

This is a Master's thesis from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Jiangfeng Du; [2018]


Abstract: Cloud computing is widely deployed in industry, enabling customers to save infrastructure acquisition and network deployment costs. To provide better performance for applications driven by cloud computing, management of the resource pool in the cloud needs to be effective. For efficient communication within the resource pool, an overlay of nodes is formed. Membership management maintains the membership lists and relationships for all nodes in the overlay, detects membership changes, and exposes the membership list to other management components or upper-layer services. However, most existing solutions only maintain a relatively static overlay that cannot reflect the real-time, dynamic network and system conditions, so the tasks and services that depend on membership management perform suboptimally.

To deal with this problem, we proposed and implemented a self-adaptive and hierarchical membership management system. The structure of the overlay can be changed dynamically according to predefined and real-time cost values between each pair of nodes. A transfer approach is proposed to move one node from one cluster to another and thereby replace a high-cost link with a low-cost link; a merge approach is proposed to decrease the number of clusters with relatively small size and improve the connectivity of the whole overlay network. Ideally, the resulting overlay network will possess a fully connected, tree-based hierarchical structure with minimum overall cost. The optimized structure could benefit the services running on top, such as resource scheduling and task placement.

The system is evaluated in an emulated environment. Results from experiments show that the structure can adapt to cost changes and that the overall cost can be reduced when the parameters are set properly. The communication overhead incurred by messages stays low for non-leader nodes, grows with the level for leader nodes, and does not increase much with the number of nodes in the system. The failure of nodes can be detected with high accuracy and relatively low latency. Moreover, the whole structure can recover promptly from events such as leader failure.
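
The transfer and merge operations mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the decision rules, so everything here is an assumption made for illustration: the `Cluster` type, the `try_transfer` and `merge` functions, the cost table keyed by node pairs, and the rule that a node moves when its cost to the other leader is lower by more than a `threshold`. It is not the thesis implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Hypothetical model: one leader plus a set of non-leader members.
    leader: str
    members: set = field(default_factory=set)


def link_cost(cost, a, b):
    """Look up the symmetric cost between two nodes (assumed cost table)."""
    return cost[frozenset((a, b))]


def try_transfer(node, src, dst, cost, threshold=0.0):
    """Move `node` from `src` to `dst` if its link to the `dst` leader is
    cheaper than its link to the `src` leader by more than `threshold`.
    Returns True when the transfer happened. The rule is an assumption."""
    current = link_cost(cost, node, src.leader)
    candidate = link_cost(cost, node, dst.leader)
    if current - candidate > threshold:
        src.members.discard(node)
        dst.members.add(node)
        return True
    return False


def merge(small, large):
    """Fold a small cluster into a larger one. The former leader of the
    small cluster becomes an ordinary member of the merged cluster."""
    return Cluster(large.leader, large.members | small.members | {small.leader})


# Example: node "n" is expensive to reach from leader "a", cheap from "b".
cost = {frozenset(("a", "n")): 9.0, frozenset(("b", "n")): 2.0}
a, b = Cluster("a", {"n"}), Cluster("b")
moved = try_transfer("n", a, b, cost, threshold=1.0)
```

A nonzero `threshold` in this sketch is one way to avoid nodes oscillating between clusters when costs fluctuate slightly, which fits the abstract's remark that the overall cost is reduced only when parameters are set properly.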
