Abstract. We present a method to extract a time series (Number of Active Requests (NAR)) from web cache logs which serves as a transport level measurement of internet traffic. This...
A similarity measure is described that does not require the prior specification of features or the need for training sets of representative data. Instead large numbers of feature...
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Gabor filters are biologically motivated convolution kernels that have been widely used in the field of computer vision and, specially, in face recognition during the last decad...