Record matching is the task of identifying records that match the same real world entity. This is a problem of great significance for a variety of business intelligence applicatio...
The problem of identifying approximately duplicate objects in databases is an essential step for the information integration process. Most existing approaches have relied on gener...
An approximate search query on a collection of strings finds those strings in the collection that are similar to a given query string, where similarity is defined using a given si...
Abstract. This paper describes a method for correction of optically read Devanagari character strings using a Hindi word dictionary. The word dictionary is partitioned in order to ...
Data Cleaning is an important process that has been at the center of research interest in recent years. An important end goal of effective data cleaning is to identify the relatio...
Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Sr...