PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local i...
Paolo Boldi, Violetta Lonati, Massimo Santini, Seb...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Various online studies on the prevalence of spyware attest overwhelming numbers (up to 80%) of infected home computers. However, the term spyware is ambiguous and can refer to anyt...
Andreas Stamminger, Christopher Kruegel, Giovanni ...
The TREC .GOV collection makes a valuable web testbed for distributed information retrieval methods because it is naturally partitioned and includes 725 web-oriented queries with ...
In this paper, we study how to find maximal k-edge-connected subgraphs from a large graph. k-edge-connected subgraphs can be used to capture closely related vertices, and findin...