Sciweavers

67 search results - page 5 / 14
» A Primitive Operator for Similarity Joins in Data Cleaning
DMKD
2004
ACM
14 years 23 days ago
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
SIGMOD
2003
ACM
14 years 7 months ago
Robust and Efficient Fuzzy Match for Online Data Cleaning
To ensure high data quality, data warehouses must validate and cleanse incoming data tuples from external sources. In many situations, clean tuples must match acceptable tuples in...
Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, R...
IQIS
2007
ACM
13 years 9 months ago
Accuracy of Approximate String Joins Using Grams
Approximate join is an important part of many data cleaning and integration methodologies. Various similarity measures have been proposed for accurate and efficient matching of st...
Oktie Hassanzadeh, Mohammad Sadoghi, Renée ...
ICDE
2006
IEEE
14 years 8 months ago
Laws for Rewriting Queries Containing Division Operators
Relational division, also known as small divide, is a derived operator of the relational algebra that realizes a many-to-one set containment test, where a set is represented as a ...
Ralf Rantzau, Christoph Mangold
ICDE
2011
IEEE
12 years 11 months ago
Join queries on uncertain data: Semantics and efficient processing
Uncertain data is quite common nowadays in a variety of modern database applications. At the same time, the join operation is one of the most important but expensive operations...
Tingjian Ge