Search engines are among the most important applications or services on the web. Most existing successful search engines use global ranking algorithms to generate the ranking of do...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
Finding relationships between entities on the Web, e.g., the connections between different places or the commonalities of people, is a novel and challenging problem. Existing Web ...
Web search engines discover indexable documents by recursively ‘crawling’ from a seed URL. Their rankings take into account link popularity. While this works well, it introduc...
Tom Rowlands, David Hawking, Ramesh Sankaranarayan...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...