Sciweavers

» Email data cleaning
SIGMOD
2003
ACM
Robust and Efficient Fuzzy Match for Online Data Cleaning
To ensure high data quality, data warehouses must validate and cleanse incoming data tuples from external sources. In many situations, clean tuples must match acceptable tuples in...
Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, R...
ICTIR
2009
Springer
Training Data Cleaning for Text Classification
In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing t...
Andrea Esuli, Fabrizio Sebastiani
ICDE
2006
IEEE
A Primitive Operator for Similarity Joins in Data Cleaning
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
Surajit Chaudhuri, Venkatesh Ganti, Raghav Kaushik
ICDE
2007
IEEE
Conditional Functional Dependencies for Data Cleaning
We propose a class of constraints, referred to as conditional functional dependencies (CFDs), and study their applications in data cleaning. In contrast to traditional functional ...
Philip Bohannon, Wenfei Fan, Floris Geerts, Xibei ...
SIGMOD
2005
ACM
A notation and system for expressing and executing cleanly typed workflows on messy scientific data
The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneou...
Yong Zhao, James E. Dobson, Ian T. Foster, Luc Mor...