Sciweavers

SIGMOD
2008
ACM

DiMaC: a system for cleaning disguised missing data

In some applications, such as filling in a customer information form on the web, missing values may not be explicitly represented as such, but instead appear as potentially valid data values. Such values are known as disguised missing data, and they may severely impair the quality of data analysis. The few previous studies on cleaning disguised missing data rely heavily on domain background knowledge in specific applications and may not work well when the disguise values are inliers. Recently, we studied the problem of cleaning disguised missing data systematically and proposed an effective heuristic approach [2]. In this paper, we describe a demonstration of DiMaC, a Disguised Missing Data Cleaning system that can find the frequently used disguise values in data sets without requiring any domain background knowledge. In this demo, we will show (1) the critical techniques of finding suspicious disguise values; (2) the architecture and user interf...
Ming Hua, Jian Pei
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2008
Where SIGMOD
Authors Ming Hua, Jian Pei