Abstract:
Graph databases have become a popular choice for data representation and analysis, especially for modeling networks of many kinds. They are typically built on the property graph data model, in which nodes and edges are annotated with property-value pairs. Although time is present in most real-world problems, most prior research in this field considers graphs in which the temporal dimension is ignored. This thesis describes the model presented in a paper published in The VLDB Journal 30[5]. It analyzes the problem of modeling, storing, and querying temporal property graphs, which makes it possible to preserve the history of a graph database. Specifically, the thesis addresses a temporal graph data model in which nodes and relationships carry key-value attributes that are valid within a defined time interval. In this model, graphs may contain different kinds of relationships. The paper also introduces a high-level graph query language, T-GQL, together with a collection of algorithms for computing different kinds of temporal paths in a graph. These paths capture distinct temporal path semantics: continuous paths, pairwise continuous paths, and consecutive paths. T-GQL can express queries such as “Find paths between Anchorage and Los Angeles, taking into account flights where the arrival time precedes the departure time of the subsequent flight.” To validate the feasibility of the approach, a proof-of-concept implementation on top of Neo4j is provided.
Moreover, a client-side user interface allows queries written in T-GQL to be submitted to a Neo4j server. To address the gap between synthetic datasets and real-world complexity, this thesis introduces a “Price” attribute, reflecting real-world aviation dynamics, into the analysis. It examines how this variable influences algorithmic outcomes, shedding light on the interplay between price and path selection. Although real-world aviation involves many attributes, the focus here remains on price and distance. This extension aims to narrow the gap between theory and practice, offering insight into how practical considerations such as price fluctuations affect path selection in temporal property graphs. The thesis concludes with a set of experiments on a synthetic data set that serve a dual purpose: first, to validate the feasibility of the method, and second, to evaluate the variables that influence performance, including the length of the queried paths and the size of the graph.
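To make the flight example concrete, the following is a minimal Python sketch of the condition quoted in the query above: a path is accepted only if each flight arrives before the next one departs. It is an illustration under stated assumptions, not the T-GQL implementation or the algorithms of the cited paper. The names `Flight`, `consecutive_paths`, and `max_hops`, the airport codes, and the integer timestamps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Flight:
    """A hypothetical temporal edge: one flight with a departure and arrival time."""
    origin: str
    destination: str
    departure: int  # assumed unit: minutes since an arbitrary reference time
    arrival: int


def consecutive_paths(flights: List[Flight], source: str, target: str,
                      max_hops: int = 3) -> List[List[Flight]]:
    """Enumerate paths in which each flight arrives before the next departs."""
    # Index flights by origin airport so neighbours can be looked up directly.
    by_origin: Dict[str, List[Flight]] = {}
    for flight in flights:
        by_origin.setdefault(flight.origin, []).append(flight)

    results: List[List[Flight]] = []

    def extend(path: List[Flight], visited: Set[str]) -> None:
        last = path[-1]
        if last.destination == target:
            results.append(list(path))
            return
        if len(path) >= max_hops:
            return
        for nxt in by_origin.get(last.destination, []):
            if nxt.destination in visited:
                continue  # do not revisit an airport
            if last.arrival < nxt.departure:  # the temporal condition
                path.append(nxt)
                extend(path, visited | {nxt.destination})
                path.pop()

    for first in by_origin.get(source, []):
        extend([first], {source, first.destination})
    return results


# Made-up sample data: two connections via SEA and one direct flight.
flights = [
    Flight("ANC", "SEA", 100, 300),
    Flight("SEA", "LAX", 350, 500),  # departs after the ANC-SEA arrival
    Flight("SEA", "LAX", 250, 400),  # departs before the ANC-SEA arrival
    Flight("ANC", "LAX", 120, 480),
]
paths = consecutive_paths(flights, "ANC", "LAX")
routes = [[p[0].origin] + [f.destination for f in p] for p in paths]
print(routes)
```

In this sketch the connection through SEA that departs at 250 is rejected because the incoming flight only arrives at 300. A “Price” attribute of the kind discussed above could be added as an extra field on `Flight` and used to rank the returned paths.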