We consider the problem of finding highly correlated pairs in a large data set. That is, given a threshold not too small, we wish to report all the pairs of items (or binary attri...
We investigate the problem of learning document classifiers in a multilingual setting, from collections where labels are only partially available. We address this problem in the ...
We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink ...
Barbara Poblete, Carlos Castillo, Aristides Gionis
We describe the WebCLEF 2008 task. Similarly to the 2007 edition of WebCLEF, the 2008 edition implements a multilingual "information synthesis" task, where, for a given t...
: Making cloud services responsive is critical to providing a compelling user experience. Many largescale sites, including LinkedIn, Digg and Facebook, address this need by deployi...