Search Sciweavers | Sciweavers

SIGMOD
2003
ACM

Robust and Efficient Fuzzy Match for Online Data Cleaning

16 years 6 months ago

To ensure high data quality, data warehouses must validate and cleanse incoming data tuples from external sources. In many situations, clean tuples must match acceptable tuples in...

Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, R...

ICTIR
2009
Springer

Training Data Cleaning for Text Classification

15 years 4 months ago

Abstract. In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing t...

Andrea Esuli, Fabrizio Sebastiani

ICDE
2006
IEEE

A Primitive Operator for Similarity Joins in Data Cleaning

16 years 8 months ago

Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...

Surajit Chaudhuri, Venkatesh Ganti, Raghav Kaushik

ICDE
2007
IEEE

Conditional Functional Dependencies for Data Cleaning

16 years 8 months ago

We propose a class of constraints, referred to as conditional functional dependencies (CFDs), and study their applications in data cleaning. In contrast to traditional functional ...

Philip Bohannon, Wenfei Fan, Floris Geerts, Xibei ...

SIGMOD
2005
ACM

A notation and system for expressing and executing cleanly typed workflows on messy scientific data

16 years 6 months ago

The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneou...

Yong Zhao, James E. Dobson, Ian T. Foster, Luc Mor...
