Sciweavers

777 search results - page 3 / 156
» Declarative Data Cleaning: Language, Model, and Algorithms
ICDE
2009
IEEE
14 years 9 months ago
Large-Scale Deduplication with Constraints Using Dedupalog
We present a declarative framework for collective deduplication of entity references in the presence of constraints. Constraints occur naturally in many data cleaning domains and c...
Arvind Arasu, Christopher Ré, Dan Suciu
DMKD
2004
ACM
14 years 27 days ago
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
DCC
2003
IEEE
14 years 7 months ago
PPM Model Cleaning
The Prediction by Partial Matching (PPM) algorithm uses a cumulative frequency count of input symbols in different contexts to estimate their probability distribution. Excellent c...
Milenko Drinic, Darko Kirovski, Miodrag Potkonjak
ICDE
2010
IEEE
14 years 7 months ago
Probabilistic Declarative Information Extraction
Unstructured text represents a large fraction of the world's data. It often contains snippets of structured information (e.g., people's names and zip ...
Daisy Zhe Wang, Eirinaios Michelakis, Joseph M. He...
SAC
2005
ACM
14 years 1 months ago
The role of visualization in effective data cleaning
Using visualization techniques to assist conventional data mining tasks has attracted considerable interest in recent years. This paper addresses a challenging issue in the use of...
Yu Qian, Kang Zhang