Today’s data integration systems must be flexible enough to support the typical iterative and incremental process of integration, and may need to scale to hundreds of data sour...
Given a terabyte click log, can we build an efficient and effective click model? It is commonly believed that web search click logs are a gold mine for search business, because th...
Anitha Kannan, Chao Liu, Christos Faloutsos, ...
Semantically heterogeneous and distributed data sources are quite common in several application domains such as bioinformatics and security informatics. In such a setting, each dat...
In this paper, we present an extension of PHIL, a declarative language for filtering information from XML data. The proposed approach allows us to extract relevant data as well a...
Distributed hash tables (DHTs) are very efficient for querying based on key lookups. However, building huge term indexes, as required for IR-style keyword search, poses a scalabil...
Odysseas Papapetrou, Wolf Siberski, Wolfgang Nejdl