Published experiments on spidering the Web suggest that, given training data in the form of a (relatively small) subgraph of the Web containing a subset of a selected class of tar...
The research interests of most scientists change continually. Through an analysis of research interests, we find that all the changes are with some character...
As a good complement to page content, anchor texts have been extensively used, and proven to be useful, in commercial search engines. However, anchor texts have been assumed to be...
Zhicheng Dou, Ruihua Song, Jian-Yun Nie, Ji-Rong W...
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
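For readers unfamiliar with the measures named in this abstract, here is a minimal sketch of the textbook count-based forms of Dice, Overlap, Jaccard, and Cosine. It is not the paper's method: the abstract refers to variants of these measures and to a double-checking model, neither of which is reproduced here. The function name `association_measures` and the parameters `f_x`, `f_y`, and `f_xy` (page counts for each term and for both together) are assumptions made for illustration.

```python
import math


def association_measures(f_x: int, f_y: int, f_xy: int) -> dict[str, float]:
    """Textbook association measures from co-occurrence counts.

    f_x, f_y: hypothetical counts of pages containing term x / term y.
    f_xy:     hypothetical count of pages containing both terms.
    These are the standard definitions, not the variants in the paper.
    """
    # With no pages for either term, every measure is defined as 0 here.
    if f_x <= 0 or f_y <= 0:
        return {"dice": 0.0, "overlap": 0.0, "jaccard": 0.0, "cosine": 0.0}

    union = f_x + f_y - f_xy  # pages containing at least one of the terms
    return {
        "dice": 2 * f_xy / (f_x + f_y),
        "overlap": f_xy / min(f_x, f_y),
        "jaccard": f_xy / union if union else 0.0,
        "cosine": f_xy / math.sqrt(f_x * f_y),
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(association_measures(100, 50, 25))
```

All four measures range from 0 (the terms never co-occur) to 1 (they always co-occur); they differ only in how the joint count is normalized.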
This paper outlines the technical details of a prototype system for searching and browsing over a million images from the World Wide Web using their visual contents. The ...