Previous efforts on event detection from the web have focused primarily on web content and structure data, ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
Web information retrieval is best known for its use of the Web’s link structure as a source of evidence. Global link evidence is by nature query-independent, and is therefore no ...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Blogging, as a subset of the web as a whole, can benefit greatly from the addition of semantic metadata. The result -- which we will call Semantic Blogging -- provides improved cap...
A link farm is a set of web pages constructed to inflate the importance of target pages in search engine results by boosting their link-based ranking scores. In this paper, we int...