A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
A major challenge in developing models for hypertext retrieval is to effectively combine content information with the link structure available in hypertext collections. Although s...
In this paper we present an analysis of HTTP traffic captured from Internet cafés and kiosks from two different developing countries: Cambodia and Ghana. This paper has two main...
Without the proliferation of formal semantic annotations, the Semantic Web is certainly doomed to failure. In earlier work we presented a new paradigm to avoid this: the 'Sel...
Understanding and managing the response time of web services is of key importance as dependence on the World Wide Web continues to grow. We present Remote Latency-based Management...