Implementing the GraphQL Interface on top of a Graph Database

This is a Master's thesis from Linköping University/Database and Information Techniques; Linköping University/Faculty of Science and Engineering

Author: Linn Mattsson; [2020]

Keywords: GraphQL; Graph Databases; Performance; Logical Plans

Abstract: Since becoming an open source project in 2015, GraphQL has gained popularity as a query language between front-end and back-end, because it lets a client request exactly the data it needs and so avoids both over-fetching and under-fetching. Although the query language has been openly available for a few years, there has been little academic research in this area. The aim of this thesis is to create an approach for using GraphQL on top of a graph database, and to evaluate the optimisation techniques available for this approach. This was done by developing logical plans and query execution plans, and the suitable optimisation techniques were found to be parallel execution and batching of database calls. The implementation was done in Java using the graph computing framework Apache TinkerPop, which is compatible with a number of graph databases; this implementation, however, focuses on the graph database management system Neo4j. To evaluate the implementation, query templates and data from the Linköping GraphQL Benchmark were used. The logical plans were created by converting a GraphQL query into a tree of logical operators. The query execution plans were based on four different primitives from the Apache TinkerPop framework, and each physical operator was influenced by one or more logical operators. The performance tests of the implementation showed that query execution times depended largely on the query template and on the number of database nodes visited. Regarding the number of threads used in the parallel execution, queries with lower execution times (<100 ms) improved most when 4-6 threads were used, while queries with higher execution times improved when 12-24 threads were used. For very fast query executions (<5 ms), threading caused more overhead than the time saved by parallel execution, and in these cases it was better not to use any threading.
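
As an illustration of the two optimisation techniques named in the abstract, the following is a minimal Java sketch of batching combined with parallel execution. It is not the thesis implementation: the class `BatchedParallelFetch`, the methods `partition` and `fetchAll`, and the in-memory `store` that stands in for a graph database are all hypothetical, and only the standard `java.util.concurrent` API is used. Passing `threads <= 1` runs the batches sequentially, which mirrors the observation that very fast queries are better off without threading.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class BatchedParallelFetch {

    // Splits ids into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    // Issues one call per batch (hypothetical stand-in for a database call).
    // With threads <= 1 the batches run sequentially on the calling thread;
    // otherwise they run on a fixed thread pool. Results keep the input order.
    static <T, R> List<R> fetchAll(List<T> ids, int batchSize, int threads,
                                   Function<List<T>, List<R>> batchCall)
            throws InterruptedException, ExecutionException {
        List<List<T>> batches = partition(ids, batchSize);
        List<R> results = new ArrayList<>();
        if (threads <= 1) {
            for (List<T> batch : batches) {
                results.addAll(batchCall.apply(batch));
            }
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (List<T> batch : batches) {
                Callable<List<R>> task = () -> batchCall.apply(batch);
                futures.add(pool.submit(task));
            }
            // Collect in submission order so the output order is deterministic.
            for (Future<List<R>> future : futures) {
                results.addAll(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical in-memory "database": node id -> node label.
        Map<Integer, String> store = Map.of(1, "a", 2, "b", 3, "c", 4, "d", 5, "e");
        Function<List<Integer>, List<String>> lookup = batch -> {
            List<String> labels = new ArrayList<>();
            for (Integer id : batch) {
                labels.add(store.get(id));
            }
            return labels;
        };
        System.out.println(fetchAll(List.of(1, 2, 3, 4, 5), 2, 4, lookup));
    }
}
```

Batch size and thread count are left as parameters here because the abstract reports that the best thread count depends on how long the query takes.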
