Sökning: "HopsWorks"
Visar resultat 1 - 5 av 13 uppsatser innehållade ordet HopsWorks.
1. Faster Reading with DuckDB and Arrow Flight on Hopsworks : Benchmark and Performance Evaluation of Offline Feature Stores
Master-uppsats, KTH/Skolan för elektroteknik och datavetenskap (EECS)Sammanfattning : Over the last few years, Machine Learning has become a huge field with “Big Tech” companies sharing their experiences building machine learning infrastructure. Feature Stores, used as centralized data repositories for machine learning features, are seen as a central component to operational and scalable machine learning. LÄS MER
2. Data Build Tool (DBT) Jobs in Hopsworks
Master-uppsats, KTH/Skolan för elektroteknik och datavetenskap (EECS)Sammanfattning : Feature engineering at scale is always critical and challenging in the machine learning pipeline. Modern data warehouses enable data analysts to do feature engineering by transforming, validating and aggregating data in Structured Query Language (SQL). LÄS MER
3. Project based multi-tenant managed RStudio on Kubernetes for Hopsworks
Master-uppsats, KTH/Skolan för elektroteknik och datavetenskap (EECS)Sammanfattning : In order to fully benefit from cloud computing, services are designed following the “multi-tenant” architectural model which is aimed at maximizing resource sharing among users. However, multi-tenancy introduces challenges of security, performance isolation, scaling and customization. LÄS MER
4. Project-based Multi-tenant Container Registry For Hopsworks
Master-uppsats, KTH/Skolan för elektroteknik och datavetenskap (EECS)Sammanfattning : There has been a substantial growth in the usage of data in the past decade, cloud technologies and big data platforms have gained popularity as they help in processing such data on a large scale. Hopsworks is such a managed plat- form for scale out data science. LÄS MER
5. Hudi on Hops : Incremental Processing and Fast Data Ingestion for Hops
Master-uppsats, KTH/Skolan för elektroteknik och datavetenskap (EECS)Sammanfattning : In the era of big data, data is flooding from numerous data sources and many companies have been utilizing different types of tools to load and process data from various sources in a data lake. The major challenges where different companies are facing these days are how to update data into an existing dataset without having to read the entire dataset and overwriting it to accommodate the changes which have a negative impact on the performance. LÄS MER