Locality-aware loadbalancing in a Service Mesh

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: Most services today are developed with a microservice architecture where each component is deployed with multiple replicas on servers all over the world. When requests go between service components, the role of a load balancer is to route each request to the least loaded instance of the target component. There are many algorithms that evaluate different parameters and select an instance from those. One approach is to optimize for latency, i.e., choose the instance that will result in the lowest latency. However, this approach does not take into consideration the geographical distribution of servers, or when requests have to cross networking boundaries, i.e., go from one physical data center to another. Crossing networking boundaries comes with an increased cost as connecting two data centers far apart is an expensive task. Therefore, the cloud computing provider will charge this traffic more than when just sending traffic within a single data center. This study set out to use Google Traffic Director, a service mesh that has information about the whole system and can, therefore, offer locality-aware load-balancing that tries to minimize the amount of traffic that crosses networking boundaries. This is compared to a latency-based algorithm without a service mesh architecture, namely Expected Latency Selector. The study was set up to evaluate how the different approaches performed in terms of cost, latency, and resilience. This evaluation was performed by setting up two testing environments where both load-balancing algorithms could run and relevant metrics were collected. This was then tested in three different scenarios: no disturbance, random delay in a zone, and the final being a zone failing all requests. Results show that in a perfect environment, a locality-aware approach with Traffic Director can reduce the networking cost to an optimal level by only sending a negligible amount of requests cross-zone, while still performing equally well as the latency-based approach in terms of latency. However, when a delay or failure is introduced, Traffic Director, in our setup, keeps the same behavior of prioritizing the locality instead of distributing requests to other zones to even out the latency and circumvent the faulty servers. 

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)