Query processing over graph-structured data is enjoying a growing number of applications. Keyword search on a graph finds a set of answers, each of which is a substructure of the graph containing all query keywords. Existing approaches start by identifying nodes containing at least one keyword and then use various expansion or join techniques to grow partial answers until k top-ranking complete answers are found. These approaches suffer from several drawbacks, e.g., not taking full advantage of indexes, search strategies with no worst-case guarantees, and high memory requirements. To address these problems, we propose BLINKS, a bi-level indexing and query processing scheme for top-k keyword search on directed graphs. BLINKS follows a search strategy with provable performance bounds, while additionally exploiting a novel bi-level index for pruning and accelerating the search. To reduce the storage requirement of the index, BLINKS partitions a data graph into blocks: The bi-level index ...
Haixun Wang, Hao He, Jun Yang, Philip S. Yu
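To make the baseline strategy described in the abstract concrete, here is a minimal sketch of the "find keyword nodes, then expand" idea. It is not BLINKS and uses no index or partitioning. It assumes an unweighted directed graph, treats an answer as a root node that can reach at least one node for every keyword, and scores a root by the sum of its hop distances to the nearest match for each keyword. The function names (`backward_distances`, `top_k_roots`) and the scoring rule are assumptions chosen for illustration.

```python
from collections import deque
import heapq


def backward_distances(graph, sources):
    """Multi-source BFS over reversed edges.

    Returns, for every node that can reach some source, the hop count
    to its nearest source (assumption: all edges have unit weight).
    """
    # Build the reversed adjacency list once.
    reverse = {}
    for u, successors in graph.items():
        reverse.setdefault(u, [])
        for v in successors:
            reverse.setdefault(v, []).append(u)

    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist


def top_k_roots(graph, node_keywords, query, k):
    """Return up to k (score, root) pairs, lowest score first.

    A root qualifies only if it reaches a match for every query keyword.
    The score is the sum of hop distances (an illustrative ranking only).
    """
    per_keyword = []
    for kw in query:
        sources = [n for n, kws in node_keywords.items() if kw in kws]
        per_keyword.append(backward_distances(graph, sources))
    if not per_keyword:
        return []

    # Candidate roots must appear in every keyword's distance map.
    candidates = set(per_keyword[0])
    for dist in per_keyword[1:]:
        candidates &= set(dist)

    scored = [(sum(dist[r] for dist in per_keyword), r) for r in candidates]
    return heapq.nsmallest(k, scored)


# Hypothetical toy graph: adjacency lists of outgoing edges.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
node_keywords = {"b": {"graph"}, "c": {"search"}, "d": {"graph", "search"}}

print(top_k_roots(graph, node_keywords, ["graph", "search"], 3))
```

This sketch runs one full traversal per keyword and keeps every distance map in memory, which illustrates the memory and pruning drawbacks the abstract attributes to existing approaches.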