We study the performance of the Timestamp(0) (TS(0)) algorithm for self-organizing sequential search on discrete memoryless sources. We demonstrate that TS(0) is better than Move-...
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Web search engines have become the primary method of accessing information on the web. Billions of queries are submitted to major web search engines, reflecting a wide range of in...
—Current keyword search by Google, Yahoo, and so on gives enormous unsuitable results. A solution to this perhaps is to annotate semantics to textual web data to enable semantic ...
Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that allow users to specify complex and precise queries to access the underlying ...
Hai He, Weiyi Meng, Yiyao Lu, Clement T. Yu, Zongh...