Sciweavers

VLDB
2004
ACM

Merging the Results of Approximate Match Operations

Data cleaning is an important process that has been at the center of research interest in recent years. An important end goal of effective data cleaning is to identify the relational tuple or tuples that are “most related” to a given query tuple. Various techniques have been proposed in the literature for efficiently identifying approximate matches to a query string against a single attribute of a relation. In addition to constructing a ranking (i.e., ordering) of these matches, the techniques often associate, with each match, scores that quantify the extent of the match. Since multiple attributes could exist in the query tuple, issuing approximate match operations for each of them separately will effectively create a number of ranked lists of the relation tuples. Merging these lists to identify a final ranking and scoring, and returning the top-K tuples, is a challenging task. In this paper, we adapt the well-known footrule distance (for merging ranked lists) to effectively dea...
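
To make the footrule distance mentioned in the abstract concrete, here is a minimal Python sketch. It computes the Spearman footrule distance between two full rankings of the same tuples, then merges several per-attribute rankings with a simple median-position heuristic. The heuristic is a stand-in for illustration only and is not the algorithm from the paper; the function names (`footrule_distance`, `merge_top_k`), the tuple ids, and the three example rankings are all hypothetical. The sketch also ignores match scores and assumes every list ranks every tuple.

```python
from statistics import median


def footrule_distance(ranking_a, ranking_b):
    """Spearman footrule: sum of absolute position differences per item."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must contain the same items")
    return sum(abs(pos_a[item] - pos_b[item]) for item in pos_a)


def merge_top_k(rankings, k):
    """Illustrative heuristic (an assumption, not the paper's method):
    order items by their median position across all lists,
    breaking ties by item id, and return the first k."""
    items = rankings[0]
    positions = {item: [r.index(item) for r in rankings] for item in items}
    merged = sorted(items, key=lambda item: (median(positions[item]), item))
    return merged[:k]


# Hypothetical ranked lists, one per query attribute, over tuple ids t1..t4.
by_name = ["t1", "t2", "t3", "t4"]
by_address = ["t2", "t1", "t4", "t3"]
by_phone = ["t1", "t3", "t2", "t4"]

distance = footrule_distance(by_name, by_address)
top_two = merge_top_k([by_name, by_address, by_phone], k=2)
```

In this example each tuple moves by exactly one position between `by_name` and `by_address`, so the distance is the sum of four unit displacements. A merged ranking that keeps such distances small against every input list is the kind of consensus ordering the abstract describes.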
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where VLDB
Authors Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Srivastava