Recent research has shown that supervised classification systems can automatically identify low-quality microarray data. However, this approach requires a qualified expert to annotate a large training set. In this paper we show that an unsupervised classification technique, based on the Expectation-Maximization (EM) algorithm and naive Bayes classification, can serve the same purpose without expert labels. On our test set, this system exhibits performance comparable to that of an analogous supervised learner constructed from the same training data.

Keywords: microarray, quality control, EM algorithm, naive Bayes
Brian E. Howard, Beate Sick, Imara Perera, Yang Ju
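
To make the idea in the abstract concrete, the following is a minimal sketch of how EM can fit a naive Bayes model to unlabeled data. It is an illustration under stated assumptions, not the system described in this paper: the two numeric features, the Gaussian per-feature likelihoods, the sort-based initialization, the synthetic data, and the names `em_naive_bayes` and `gaussian_logpdf` are all hypothetical choices made for the example.

```python
import math
import random


def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian (assumed likelihood form)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def em_naive_bayes(rows, n_classes=2, n_iter=30, min_var=1e-6):
    """Fit a Gaussian naive Bayes model to unlabeled rows with EM.

    Returns class priors, per-class feature means and variances, and the
    final class responsibilities (posterior probabilities) for every row.
    """
    n, d = len(rows), len(rows[0])

    # Hypothetical initialization: hard-assign rows to classes by the rank
    # of their first feature, so the run is deterministic.
    order = sorted(range(n), key=lambda i: rows[i][0])
    resp = [[0.0] * n_classes for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][min(rank * n_classes // n, n_classes - 1)] = 1.0

    for _ in range(n_iter):
        # M-step: re-estimate priors, means, and variances from the
        # current soft assignments.
        priors, means, variances = [], [], []
        for k in range(n_classes):
            weight = sum(r[k] for r in resp) + 1e-12  # avoid division by zero
            priors.append(weight / n)
            mu = [sum(resp[i][k] * rows[i][j] for i in range(n)) / weight
                  for j in range(d)]
            var = [max(sum(resp[i][k] * (rows[i][j] - mu[j]) ** 2
                           for i in range(n)) / weight, min_var)
                   for j in range(d)]
            means.append(mu)
            variances.append(var)

        # E-step: posterior class probabilities under the naive Bayes
        # assumption that features are independent given the class.
        for i in range(n):
            log_post = [
                math.log(priors[k])
                + sum(gaussian_logpdf(rows[i][j], means[k][j], variances[k][j])
                      for j in range(d))
                for k in range(n_classes)
            ]
            top = max(log_post)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in log_post]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]

    return priors, means, variances, resp


# Synthetic stand-in data (not real microarray measurements): 40 rows drawn
# around (0, 0) followed by 40 rows drawn around (8, 8).
rng = random.Random(0)
rows = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
rows += [[rng.gauss(8.0, 1.0), rng.gauss(8.0, 1.0)] for _ in range(40)]

priors, means, variances, resp = em_naive_bayes(rows)

# Assign each row to its most probable class.
labels = [max(range(len(r)), key=lambda k: r[k]) for r in resp]
```

In this sketch the class indices carry no meaning on their own. Deciding which learned class corresponds to low-quality arrays would still require some outside knowledge, for example inspecting the class means.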