We investigate the problem of creating and analyzing samples of relational databases to find relationships between string-valued attributes. Our focus is on identifying attribute...
: Relational database systems are highly efficient hosts to table-shaped data. It is all the more interesting to see how a careful inspection of both, the XML tree structure as wel...
Record linkage is the problem of identifying similar records across different data sources. The similarity between two records is defined based on domain-specific similarity functi...
We present DITN, a new method of parallel querying based on dynamic outsourcing of join processing tasks to non-dedicated, heterogeneous computers. In DITN, partitioning is not th...
The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) that are within a dis...