

Sailer: an effective search engine for unified retrieval of heterogeneous XML and Web documents

This paper studies the problem of unified ranked retrieval of heterogeneous XML documents and Web data. We propose an effective search engine called Sailer that answers keyword queries over the heterogeneous data adaptively and flexibly. We model the Web pages and XML documents as graphs. We propose the concept of pivotal trees to answer keyword queries and present a method to identify the top-k pivotal trees with the highest ranks from the graphs. Moreover, we propose indexes that facilitate the unified ranked retrieval. We have conducted an extensive experimental study using real datasets, and the experimental results show that Sailer achieves both high search efficiency and accuracy, and significantly outperforms the existing approaches.
Categories and Subject Descriptors H.2.8 [Database Applications]: Miscellaneous
General Terms Algorithms, Performance, Languages
Keywords Keyword Search, XML, Web Pages, Unified Keyword Search
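To give a rough feel for ranked keyword search over a graph model like the one the abstract describes, the sketch below scores every node by the total hop distance to the nearest node matching each query keyword and returns the k best-scoring nodes as candidate answer roots. This is a generic distance-based heuristic and not the paper's pivotal-tree definition, ranking function, or index design, none of which are given in the abstract. All names (`build_graph`, `keyword_distances`, `top_k_roots`) and the toy data are hypothetical.

```python
from collections import deque
import heapq


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def keyword_distances(graph, texts, keyword):
    """Multi-source BFS: hop distance from each reachable node to the
    nearest node whose text contains the keyword as a whole word."""
    dist = {n: 0 for n, t in texts.items() if keyword in t.lower().split()}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def top_k_roots(graph, texts, keywords, k):
    """Return up to k (score, node) pairs with the smallest total distance
    to all keywords. Nodes that cannot reach every keyword are skipped.
    Assumption: a lower total distance means a more compact answer."""
    per_keyword = [keyword_distances(graph, texts, kw.lower()) for kw in keywords]
    scored = [
        (sum(d[node] for d in per_keyword), node)
        for node in texts
        if all(node in d for d in per_keyword)
    ]
    return heapq.nsmallest(k, scored)


# Toy data: one Web page linking to two XML-like items, each with a title.
edges = [("page1", "item1"), ("page1", "item2"),
         ("item1", "title1"), ("item2", "title2")]
texts = {
    "page1": "bookstore home",
    "item1": "book",
    "title1": "xml search",
    "item2": "book",
    "title2": "web retrieval",
}
graph = build_graph(edges)
result = top_k_roots(graph, texts, ["xml", "search"], k=2)
print(result)  # [(0, 'title1'), (2, 'item1')]
```

Here both pages and XML elements are plain nodes in one graph, which is the only aspect taken from the abstract; a real system would also need a relevance-aware ranking and index structures to avoid running a full traversal per query.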
Added 21 Nov 2009
Updated 21 Nov 2009
Type Conference
Year 2008
Where WWW
Authors Guoliang Li, Jianhua Feng, Jianyong Wang, Xiaoming Song, Lizhu Zhou