Search: "SQL on Hadoop"
Showing results 1 - 5 of 8 theses containing the words SQL on Hadoop.
1. Hudi on Hops : Incremental Processing and Fast Data Ingestion for Hops
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: In the era of big data, data is flooding in from numerous sources, and many companies use different types of tools to load and process data from various sources into a data lake. A major challenge that companies face these days is how to update data in an existing dataset without having to read the entire dataset and overwrite it to accommodate the changes, which has a negative impact on performance.
2. Hive, Spark, Presto for Interactive Queries on Big Data
Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS). Abstract: Traditional relational database systems cannot be used efficiently to analyze data of large volume and varied formats, i.e., big data. Apache Hadoop is one of the first open-source tools that provides a distributed data storage system and resource manager.
3. Scheduling workflows to optimize for execution time
Master's thesis, Uppsala universitet/Institutionen för informatik och media. Abstract: Many functions in today’s society are immensely dependent on data. Data drives everything from business decisions to self-driving cars to intelligent home assistants like Amazon Echo and Google Home. To make good decisions based on data, of which exabytes are generated every day, that data somehow has to be processed.
4. Multitenant PrestoDB as a service
Master's thesis, KTH/Skolan för informations- och kommunikationsteknik (ICT). Abstract: In recent years, there has been tremendous growth in the volume of data that is produced, stored, and queried by organizations. Organizations spend more money to investigate and extract useful information or knowledge from terabytes and even petabytes of data.
5. Optimisation of Ad-hoc analysis of an OLAP cube using SparkSQL
Professional degree thesis (advanced level), Uppsala universitet/Avdelningen för beräkningsvetenskap. Abstract: An Online Analytical Processing (OLAP) cube is a way to represent a multidimensional database. The multidimensional database often uses a star schema and populates it with data from a relational database.