Recently, semantic text portion (STP) is getting popular in the field of Web mining. STP is a text portion in the original page which is semantically related to the anchor pointing...
Click data captures many users’ document preferences for a query and has been shown to help significantly improve search engine ranking. However, most click data is noisy and of...
Web graphs are approximate snapshots of the web, created by search engines. Their creation is an error-prone procedure that relies on the availability of Internet nodes and the fa...
Panagiotis Papadimitriou 0002, Ali Dasdan, Hector ...
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages...