Background: The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the interpretation of cellular phenotypes in different experimental conditions has been dependent upon the expert opinions of well-trained biologists. Such qualitative analysis is particularly effective in detecting subtle, but important, deviations in phenotypes. However, while the rapid and continuing development of automated microscope-based technologies now facilitates the acquisition of trillions of cells in thousands of diverse experimental conditions, such as in the context of RNA interference (RNAi) or small-molecule screens, the massive size of these datasets precludes human analysis. Thus, the development of automated methods which aim to identify novel and biological relevant phenotypes online is one of the major challenges in high-throughput image-based screening. Ideally, phenotype discovery methods sh...