This paper addresses the challenging problem of similarity search over widely distributed ultra-high dimensional data. One such application is retrieval of the top-k most similar d...
The IST project Cuidado, which started in January 2001, aims at producing the first entirely automatic chain for extracting and exploiting musical metadata for browsing music. The...
We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
Most prior work on information extraction has focused on extracting information from text in digital documents. Often, however, the most important information being reported in an...
Schema matching is the problem of finding correspondences (mapping rules, e.g., logical formulae) between heterogeneous schemas, e.g., in the data exchange domain, or for distribute...