The problem of finding similar pages to a given web page arises in many web applications such as search engine. In this paper, we focus on the link-based similarity measures whic...
Realtime web search refers to the retrieval of very fresh content which is in high demand. An effective portal web search engine must support a variety of search needs, including ...
PageRank is a popularity measure designed by Google to rank Web pages. Experiments confirm that PageRank values obey a power law with the same exponent as In-Degree values. This ...
Nelly Litvak, Werner R. W. Scheinhardt, Yana Volko...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
Search engines use content and link information to crawl, index, retrieve, and rank Web pages. The correlations between similarity measures based on these cues and on semantic ass...