In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engin...
Alexandros Ntoulas, Marc Najork, Mark Manasse, Den...
Since the publication of Brin and Page's paper on PageRank, many in the Web community have depended on PageRank for the static (query-independent) ordering of Web pages. We s...
Missing web pages, URIs that return the 404 “Page Not Found” error or the HTTP response code 200 but dereference unexpected content, are ubiquitous in today’s browsing exper...
Martin Klein, Jeffery L. Shipman, Michael L. Nelso...
With the increase in information on the World Wide Web it has become difficult to quickly find desired information without using multiple queries or using a topic-specific search ...
Sofus A. Macskassy, Arunava Banerjee, Brian D. Dav...
This paper presents an extensive study about the evolution of textual content on the Web, which shows how some new pages are created from scratch while others are created using al...