Abstract:
Applications on massive online social networks, biological networks, and knowledge graphs often need to find the shortest path between two given nodes. Recent work has addressed the efficient computation of exact or closely approximate shortest-path distances. Some of these techniques also quickly return the path corresponding to the estimated distance.
Many real-world graphs are edge-labeled, i.e., each edge carries a label denoting the relationship between the two vertices it connects. However, none of the existing techniques for estimating shortest paths works well when additional constraints are placed on the labels of the edges that constitute the path.
In this work, we define the problem of retrieving the shortest path between two given nodes that also satisfies user-provided constraints on the set of edge labels involved in the path. We have developed the SkIt index structure, which supports a wide range of label constraints on paths and returns an accurate estimate of the shortest path that satisfies the constraints. We have conducted experiments on graphs such as social networks and knowledge graphs containing millions of nodes and edges, and show that the SkIt index is fast, accurate in its estimated distances, and achieves high recall for paths that satisfy the constraints.
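
To make the problem statement concrete, the following is a minimal sketch of a naive exact baseline: a breadth-first search that ignores every edge whose label is outside a user-provided set. It is not the SkIt index, whose construction is not described in this abstract. The function name, the edge-triple representation, the "allowed label set" form of the constraint, and the treatment of edges as undirected are all assumptions made for illustration.

```python
from collections import deque


def label_constrained_shortest_path(edges, source, target, allowed_labels):
    """Return a shortest path (list of nodes) from source to target that uses
    only edges whose label is in allowed_labels, or None if no such path exists.

    edges: iterable of (u, v, label) triples; treated as undirected (assumption).
    """
    # Build an adjacency list that keeps only edges satisfying the constraint.
    adjacency = {}
    for u, v, label in edges:
        if label in allowed_labels:
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)

    # Standard BFS; parent pointers let us reconstruct the path.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy graph: a small social network with two relationship labels.
example_edges = [
    ("alice", "bob", "friend"),
    ("bob", "carol", "colleague"),
    ("alice", "dave", "friend"),
    ("dave", "erin", "friend"),
    ("erin", "carol", "friend"),
]
unconstrained = label_constrained_shortest_path(
    example_edges, "alice", "carol", {"friend", "colleague"}
)
friends_only = label_constrained_shortest_path(
    example_edges, "alice", "carol", {"friend"}
)
```

In this toy graph the unconstrained shortest path from alice to carol has two edges, while restricting the path to "friend" edges forces a three-edge detour. This baseline traverses the graph at query time, which is the per-query cost an index structure is meant to avoid on graphs with millions of nodes and edges.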