Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
Corruption of data by class-label noise is an important practical concern impacting many classification problems. Studies of data cleaning techniques often assume a uniform label ...
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehou...
Helena Galhardas, Daniela Florescu, Dennis Shasha,...
In this report, we provide a summary1 of the First Int'l VLDB Workshop on Clean Databases (CleanDB 2006), which took place at Seoul, Korea, on September 11, 2006, in conjunct...
The popularity of email has triggered researchers to look for ways to help users better organize the enormous amount of information stored in their email folders. One challenge th...