Search: "Spark on Kubernetes"

Found 4 theses containing the words Spark on Kubernetes.

  1. A performance study for autoscaling big data analytics containerized applications : Scalability of Apache Spark on Kubernetes

    Master's thesis, Blekinge Tekniska Högskola/Institutionen för datavetenskap

    Authors: Vinay Kumar Vennu; Sai Ram Yepuru; [2022]
    Keywords: Containers; Container Orchestration; Big data analytics; Autoscaling; Resource Management;

    Abstract: Container technologies are rapidly changing how distributed applications are executed and managed on cloud computing resources. As containers can be deployed on a large scale, there is a tremendous need for Container Orchestration tools like Kubernetes that are highly automatic in deployment, scaling, and management.

  2. Project based multi-tenant managed RStudio on Kubernetes for Hopsworks

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Gibson Chikafa; [2021]
    Keywords: Multi-tenancy; Cloud computing; Performance isolation; Security; Scaling; Docker; Kubernetes; Azure; GCP;

    Abstract: In order to fully benefit from cloud computing, services are designed following the “multi-tenant” architectural model which is aimed at maximizing resource sharing among users. However, multi-tenancy introduces challenges of security, performance isolation, scaling and customization.

  3. Spark on Kubernetes using HopsFS as a backing store : Measuring performance of Spark with HopsFS for storing and retrieving shuffle files while running on Kubernetes

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Shivam Saini; [2020]
    Keywords: Spark; Kubernetes; HopsFS; Data processing; Distributed and Parallel processing;

    Abstract: Data is a raw list of facts and details, such as numbers, words, measurements or observations that is not useful for us all by itself. Data processing is a technique that helps to process the data in order to get useful information out of it. Today, the world produces huge amounts of data that can not be processed using traditional methods.

  4. Scaling cloud-native Apache Spark on Kubernetes for workloads in external storages

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Piotr Mrowczynski; [2018]
    Keywords: Cloud Computing; Spark on Kubernetes; Kubernetes Operator; Elastic Resource Provisioning; Cloud-Native Architectures; Openstack Magnum; Containers; Data Mining;

    Abstract: CERN Scalable Analytics Section currently offers shared YARN clusters to its users as monitoring, security and experiment operations. YARN clusters with data in HDFS are difficult to provision, complex to manage and resize. This imposes new data and operational challenges to satisfy future physics data processing requirements.