Systematic content screening of cell phenotypes in microscopic images has shown promise for understanding gene function and for drug design. However, manual annotation of cells and images in genome-wide studies is prohibitively expensive. In this paper, we propose a highly efficient active annotation framework in which a small amount of expert input is used to rapidly and effectively infer the labels of the remaining unlabeled data. We formulate this as a graph-based transductive learning problem and develop a novel method for label propagation. Specifically, we propose a label regularizer to handle the label imbalance typically seen in cellular image screening applications. We also design a new scheme that breaks the graph into a linear superposition of contributions from individual labeled samples. We take advantage of this superposable representation to achieve fast annotation in an interactive setting. Extensive evaluations over toy data and realis...
Jun Wang, Shih-Fu Chang, Xiaobo Zhou, Stephen T. C
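
The superposition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes the standard closed-form label spreading solution F = (I - alpha S)^{-1} Y with S = D^{-1/2} W D^{-1/2}, and it substitutes a simple per-class count normalization for the paper's label regularizer, whose exact form is not given here. The names `rbf_affinity`, `propagation_matrix`, and `SuperposedAnnotator` are hypothetical. Because F is linear in Y, each labeled sample contributes one column of the propagation matrix, so a new expert label can be added without re-solving the system.

```python
import numpy as np


def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal (assumed graph construction)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def propagation_matrix(W, alpha=0.9):
    """Return P = (I - alpha * S)^{-1}, where S is the symmetrically normalized affinity."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.inv(np.eye(W.shape[0]) - alpha * S)


class SuperposedAnnotator:
    """Accumulates per-sample contributions so each new label is a cheap column update."""

    def __init__(self, P, n_classes):
        self.P = P
        self.raw = np.zeros((P.shape[0], n_classes))  # unnormalized class scores
        self.counts = np.zeros(n_classes)             # labeled samples per class

    def add_label(self, index, cls):
        # Superposition: the contribution of one labeled sample is one column of P.
        self.raw[:, cls] += self.P[:, index]
        self.counts[cls] += 1

    def predict(self):
        # Assumed stand-in for a label regularizer: average the contributions
        # per class so a class with many labels does not dominate the scores.
        scale = np.where(self.counts > 0, self.counts, 1.0)
        return np.argmax(self.raw / scale, axis=1)
```

In an interactive loop, `propagation_matrix` would be computed once, and each expert annotation would call `add_label` followed by `predict`. The dense inverse is only practical for small graphs; a sparse solver would be the natural replacement at scale.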