Live updates in High-availability (HA) clouds

This is a Master's thesis from

Abstract: Background. High availability (HA) is a cloud's ability to keep functioning after one or more hardware or software components fail. Its purpose is to minimize system downtime and data loss. Many service providers offer a Service Level Agreement (SLA) that guarantees an uptime percentage for the computing service, calculated from the available time and the system downtime, excluding planned outages. The aim of the thesis is to update virtual machines running in the cloud without interrupting users, by redirecting the resources and services running on them to an alternative virtual machine before the original VM is updated.

Objectives. The objectives for this aim are:
• The first objective is to investigate existing solutions for high availability and, if possible, adapt them to our aim. The alternative is to design our own solution.
• The second objective is to implement the solution in an OpenStack environment. As an alternative, we can try a smaller-scale implementation under a virtualization platform such as VirtualBox.
• The final objective is to run experiments to quantify the effectiveness of our solution in terms of overhead and degree of seamlessness to the users.

Methods. An environment with multiple virtual machines is created to represent multiple virtual servers in the cloud. The state of the service provided by the primary virtual machine is saved to persistent storage and the client is redirected to an alternate virtual machine. At that point the primary virtual machine can be rebooted for an update or for any other reason.

Results. For CPU utilization, the mean values on the server and the host in scenario 1 are 0.34% and 3.2% respectively. The mean CPU utilization on the primary server and the host in scenario 2 during the failover cycle is 2.0% and 9.7% respectively. The mean CPU utilization on the secondary server and the host in scenario 2 during the failover cycle is 0.99% and 8.0% respectively. For memory utilization, the mean memory usage on the server in scenario 1 is 16%. The mean memory usage on the primary server and the secondary server in scenario 2 during the failover cycle is 37% and 48% respectively. The failover time of the high-availability environment is 6.8 seconds, and an offline node takes 1.5 seconds to rejoin the cluster as online once it is told to. Network traffic is measured in kilobits per second: it is 1.2 kilobits per second on port 80 in scenario 2 and 1.4 kilobits per second between the client and the server in scenario 1. In addition, data traffic is captured on ports 5405, 2224 and 7788, where port 5405 (Pacemaker/Corosync) carries UDP traffic, port 2224 (Pcsd) carries TCP traffic and port 7788 (DRBD) carries TCP traffic. The traffic captured on these ports represents the network overhead due to HA. During the failover cycle, additional traffic of 45 Kb/s, 1.2 Kb/s and 7.0 Kb/s flows on ports 5405, 2224 and 7788 respectively.

Conclusions. From our experimental results we can say that the overhead of handling live updates in a high-availability environment is approximately 1.1 to 1.7% of CPU higher in HA mode than when a stand-alone server is used. Memory utilization is around 21 to 32% higher for live updates on the HA system than for the standard server. The network traffic overhead induced by the ports used by the high-availability environment (5405, 2224, 7788) is approximately 53 kilobits per second, while the minimum overhead is approximately 16 kilobits per second. The final and most important metric is the failover time, which indicates how seamless the service is, since the environment needs to provide uninterrupted service to the users. The failover time of the HA model is only about 6.8 seconds, which leaves the environment highly available. However, the user may notice a slight interruption for requests made during this interval.
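The memory and network overhead figures in the conclusions can be checked against the rounded means reported in the results. The following is a minimal sketch, not the thesis's own analysis code: the variable and function names are hypothetical, and it assumes that the memory overhead is the plain difference between the HA-mode and stand-alone utilization and that the network overhead is the sum of the traffic on the three HA ports.

```python
# Rounded mean values quoted in the abstract (memory in %, traffic in Kb/s).
MEM_STANDALONE = 16   # scenario 1, stand-alone server
MEM_PRIMARY = 37      # scenario 2, primary server during failover cycle
MEM_SECONDARY = 48    # scenario 2, secondary server during failover cycle

# Additional traffic per HA port during the failover cycle.
HA_PORT_TRAFFIC_KBPS = {
    5405: 45.0,  # Pacemaker/Corosync (UDP)
    2224: 1.2,   # Pcsd (TCP)
    7788: 7.0,   # DRBD (TCP)
}


def overhead(ha_value, baseline):
    """Assumed definition: HA-mode value minus the stand-alone baseline."""
    return ha_value - baseline


# Memory overhead for the primary and secondary servers.
mem_overhead = (
    overhead(MEM_PRIMARY, MEM_STANDALONE),
    overhead(MEM_SECONDARY, MEM_STANDALONE),
)

# Network overhead: total traffic on the ports used only by the HA stack.
net_overhead = sum(HA_PORT_TRAFFIC_KBPS.values())

print(mem_overhead)            # (21, 32)
print(round(net_overhead, 1))  # 53.2
```

Under these assumptions the memory differences reproduce the reported range of 21 to 32, and the three port figures add up to roughly the reported 53 kilobits per second. The CPU range and the 16 kilobits per second minimum are not recomputed here, since the abstract does not give the measurements they are based on.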
