Search: "MapReduce"

Showing results 11 - 15 of 33 theses containing the word MapReduce.

  11. A Coordination Framework for Deploying Hadoop MapReduce Jobs on Hadoop Cluster

    Master's thesis, KTH/Skolan för informations- och kommunikationsteknik (ICT)

    Author: Anitha Raja; [2016]
    Keywords: Hadoop; Workload Characterization; Parametric Modeling; Coordination framework; OpenStack; Workload deployment;

    Abstract: Apache Hadoop is an open-source framework that delivers reliable, scalable, distributed computing. Hadoop provides services for distributed data storage, data processing, data access, and security. MapReduce is the heart of the Hadoop framework and was designed to process vast amounts of data distributed over a large number of nodes.

  12. Implementation of the Hadoop MapReduce algorithm on virtualized shared storage systems

    Master's thesis

    Author: Shravya Nethula; [2016]
    Keywords: Hadoop; virtualized systems; shared storage; MapReduce; Hadoop Distributed File System;

    Abstract: Context: Hadoop is an open-source software framework developed for distributed storage and distributed processing of large data sets. Implementing the Hadoop MapReduce algorithm on virtualized shared storage without the Hadoop Distributed File System (HDFS) is a challenging task.

  13. Evaluation and benchmarking of Tachyon as a memory-centric distributed storage system for Apache Hadoop

    Master's thesis, KTH/Skolan för informations- och kommunikationsteknik (ICT)

    Author: Ioannis Kerkinos; [2016]
    Keywords: (none listed)

    Abstract: Hadoop was developed as an open-source software framework that initially leveraged the MapReduce programming model and was therefore able to efficiently analyse and process large datasets. At the core of Hadoop is the Hadoop Distributed File System (HDFS), which is used as the default storage across the cluster.

  14. Natural Language Processing In A Distributed Environment : A comparative performance analysis of Apache Spark and Hadoop MapReduce

    Bachelor's thesis, Umeå universitet/Institutionen för datavetenskap

    Author: Ludwig Andersson; [2016]
    Keywords: (none listed)

    Abstract: A large majority of the data hosted on the internet today is natural-language text, so understanding natural language and how to effectively process and analyze text has become a big part of data mining. Natural Language Processing has many applications in fields such as business intelligence and security.

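  15. Big Data Arkitekturer : En studie om tekniska och organisatoriska för- och nackdelar vid val av Big Data arkitektur. [Big Data Architectures: A study of the technical and organizational advantages and disadvantages when choosing a Big Data architecture.]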

    Bachelor's thesis, Högskolan i Skövde/Institutionen för informationsteknologi

    Author: Rebecca Bodegård Gustafsson; [2015]
    Keywords: Big Data architectures; databases; Apache Hadoop; Postgres-XL;

    Abstract: The explosion of social media, the use of smartphones, and constant connectivity to the internet mean that the amount of available data is continually increasing. It is not only a matter of large volumes of data, however, but also of data that arrives at high velocity and is of varied types.