Sciweavers

SAC
2005
ACM
14 years 1 month ago
The role of visualization in effective data cleaning
Using visualization techniques to assist conventional data mining tasks has attracted considerable interest in recent years. This paper addresses a challenging issue in the use of...
Yu Qian, Kang Zhang
ICTIR
2009
Springer
13 years 5 months ago
Training Data Cleaning for Text Classification
In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing t...
Andrea Esuli, Fabrizio Sebastiani
ENC
2006
IEEE
14 years 1 month ago
Cleaning Training-Datasets with Noise-Aware Algorithms
We introduce a novel learning algorithm for noise elimination. Our algorithm is based on the re-measurement idea for the correction of erroneous observations and is able to discri...
H. Jair Escalante
PAKDD
2005
ACM
14 years 28 days ago
Improving Mining Quality by Exploiting Data Dependency
The usefulness of the results produced by data mining methods can be critically impaired by several factors such as (1) low quality of data, including errors due to contamination, ...
Fang Chu, Yizhou Wang, Carlo Zaniolo, Douglas Stot...
KDD
2008
ACM
14 years 7 months ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen