In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engin...
Alexandros Ntoulas, Marc Najork, Mark Manasse, Den...
This paper introduces a method for automatically partitioning richly-formatted electronic documents. An automatic partitioning system has many potential uses, but we focus here on ...
It is well known that Web-page classification can be enhanced by using hyperlinks that provide linkages between Web pages. However, in the Web space, hyperlinks are usually sparse...
Web applications facilitated by technologies such as JavaScript, DHTML, AJAX, and Flash use a considerable amount of dynamic web content that is either inaccessible or unusable by...
Yevgen Borodin, Jeffrey P. Bigham, Rohit Raman, I....
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...