Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
A similarity join that correlates XML document fragments similar in structure and content can be used as the core algorithm to support data cleaning and data integratio...
Top-k similarity joins have been extensively studied and are used in a wide spectrum of applications such as information retrieval, decision making, spatial data analysis and dat...
Existing sequence comparison software applications lack automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons i...
Similarity joins in databases can be used for several important tasks such as data cleaning and instance-based data integration. In this paper, we explore how to support such ...