Many scientific, financial, data mining and sensor network applications need to work with continuous, rather than discrete data e.g., temperature as a function of location, or sto...
Given a user-specified minimum correlation threshold and a market basket database with N items and T transactions, an all-strong-pairs correlation query finds all item pairs with...
Duplicate detection is the process of identifying multiple representations of a same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
Multi-valued dependencies (MVDs) are an important class of constraints that is fundamental for relational database design. Although modern applications increasingly require the su...
Many documents such as Web documents or XML files have tree structures. A term tree is an unordered tree pattern consisting of internal variables and tree structures. In order to ...