Near-duplicate web documents are abundant. Two such documents may differ only in a very small portion, for example the part that displays advertisements. Such differences are irrele...
Web search engines have become the primary method of accessing information on the web. Billions of queries are submitted to major web search engines, reflecting a wide range of in...
Current keyword search by Google, Yahoo, and similar engines returns an enormous number of unsuitable results. One possible solution is to annotate textual web data with semantics to enable semantic ...
Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that allow users to specify complex and precise queries to access the underlying ...
Hai He, Weiyi Meng, Yiyao Lu, Clement T. Yu, Zongh...
This paper addresses several problems associated with the specification of Web searches, and the retrieval, filtering, and rating of Web pages in order to improve the relevance, pr...