The Shortcut Index

Detta är en Uppsats för yrkesexamina på avancerad nivå från Lunds universitet/Institutionen för datavetenskap

Sammanfattning: With a novel path index design, called the Shortcut Index, we partially solve the problem of executing traversal queries on dense neighborhoods in a graph database. We implement our design on top of the graph database Neo4j but it could be used for any graph database that uses the labeled property graph model. By using a B+ tree, the Shortcut Index can achieve what we call neighborhood locality and range locality of paths. This means that data that belongs to the same part of the graph is located in the same space on disk. We empirically evaluate how this affects performance in terms of response time. In our benchmarks we use two different datasets, one that simulates a real world use case and a "lab environment" that makes it possible to vary neighborhood density and percent of neighborhood interest to more accurately examine how it affects performance. Our experiments show that response time of the index scales very well with neighborhood density and percent of neighborhood interest when compared to Neo4j without the index. We put focus on making the Shortcut Index useful in an OLTP (online transaction processing) environment, which enforces restrictions on update overhead which in turn restricts how long paths that can be indexed. The presented design can index arbitrary long paths but in our implementation we only index paths of length one. We conclude that the Shortcut Index improves response time at a reasonable cost and is especially useful when indexing dense neighborhoods that are often queried with limitations on some property value.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)