This paper describes a new program, correct, which takes words rejected by the Unix spell program, proposes a list of candidate corrections, and sorts them by probability. The pro...
Mark D. Kernighan, Kenneth Ward Church, William A....
We demonstrate that transformation-based learning can be used to correct noisy speech recognition transcripts in the lecture domain with an average word error rate reduction of 12...
Learning from noisy data is a challenging and realistic issue for real-world data mining applications. Common practices include data cleansing, error detection and classifier ensemb...
Yan Zhang, Xingquan Zhu, Xindong Wu, Jeffrey P. Bo...
We introduce an end-to-end framework for data quality that integrates business strategy, data quality models, and supporting investigative and governance processes. We also descr...
This paper proposes a modelling of Support Vector Machine (SVM) learning to address the problem of learning with sloppy labels. In binary classification, learning with sloppy labe...