Search: "Dataprocessering"
Found 4 theses containing the word Dataprocessering.
1. A Comparative Study on Efficiency and Scalability of Integer and String Datasets in cuDF and pandas
Bachelor's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: This thesis presents a comparative analysis of cuDF and pandas, two Python data processing libraries, with a focus on performance, limitations, and scalability when handling integer and string datasets. The study aims to assess the efficiency and suitability of cuDF as a potential alternative to pandas in scenarios where high-performance data processing is required.
2. Highly Available Task Scheduling in Distinctly Branched Directed Acyclic Graphs
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: Big data processing frameworks utilizing distributed frameworks to parallelize the computing of datasets have become a staple part of the data engineering and data science pipelines. One of the more known frameworks is Dask, a widely utilized distributed framework used for parallelizing data processing jobs.
3. Scaling cloud-native Apache Spark on Kubernetes for workloads in external storages
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: CERN Scalable Analytics Section currently offers shared YARN clusters to its users as monitoring, security and experiment operations. YARN clusters with data in HDFS are difficult to provision, complex to manage and resize. This imposes new data and operational challenges to satisfy future physics data processing requirements.
4. Integrating Pig and Stratosphere
Master's thesis, KTH/Skolan för informations- och kommunikationsteknik (ICT). Abstract: MapReduce is a wide-spread programming model for processing big amounts of data in parallel. PACT is a generalization of MapReduce, based on the concept of Parallelization Contracts (PACTs). Writing efficient applications in MapReduce or PACT requires strong programming skills and in-depth understanding of the systems’ architectures.