Top-k keyword search over probabilistic XML data

13 years 10 months ago


Despite the proliferation of work on XML keyword queries, supporting keyword queries over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible-world semantics must be taken into account. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA (smallest lowest common ancestor) results with the k highest probabilities of existence. We then propose two efficient algorithms. The first algorithm, PrStack, finds the k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, we propose a second algorithm, EagerTopK, based on a set of pruning properties that quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance through an analysis of extensive experimental results.
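The general idea of keeping only the k most probable results while pruning hopeless candidates can be illustrated with a small sketch. This is not the paper's PrStack or EagerTopK; it is a generic bounded min-heap selection under assumed inputs. The function name `top_k_by_probability`, the candidate tuple layout, and the idea that each candidate comes with a cheap upper bound on its probability are all hypothetical choices made for illustration.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Hypothetical candidate layout: (node_id, upper_bound, exact_probability),
# where exact_probability is a callable standing in for an expensive
# probability computation and upper_bound is a cheap estimate with
# upper_bound >= exact_probability().
Candidate = Tuple[str, float, Callable[[], float]]


def top_k_by_probability(
    candidates: Iterable[Candidate], k: int
) -> Tuple[List[Tuple[str, float]], int]:
    """Return the k candidates with the highest probabilities and the
    number of exact probability evaluations that were performed."""
    if k <= 0:
        return [], 0

    heap: List[Tuple[float, str]] = []  # min-heap of (probability, node_id)
    evaluated = 0

    for node_id, upper_bound, exact_probability in candidates:
        # Prune: once k results are held, a candidate whose upper bound
        # does not exceed the current k-th best cannot enter the top k.
        if len(heap) == k and upper_bound <= heap[0][0]:
            continue

        evaluated += 1
        prob = exact_probability()

        if len(heap) < k:
            heapq.heappush(heap, (prob, node_id))
        elif prob > heap[0][0]:
            # Replace the weakest of the current top k.
            heapq.heapreplace(heap, (prob, node_id))

    ranked = sorted(heap, key=lambda entry: entry[0], reverse=True)
    return [(node_id, prob) for prob, node_id in ranked], evaluated


if __name__ == "__main__":
    # Made-up candidates, purely for demonstration.
    demo = [
        ("a", 0.9, lambda: 0.8),
        ("b", 0.7, lambda: 0.6),
        ("c", 0.5, lambda: 0.4),
        ("d", 0.95, lambda: 0.9),
    ]
    print(top_k_by_probability(demo, 2))
```

In this sketch the heap root always holds the current k-th best probability, so it doubles as the pruning threshold. How tight the upper bounds are determines how many exact evaluations are skipped.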

Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang
