Spreadsheets applications allow data to be stored with low development overheads, but also with low data quality. Reporting on data from such sources is difficult using traditiona...
We present a novel probabilistic latent variable model to perform linear dimensionality reduction on data sets which contain clusters. We prove that the maximum likelihood solution...
—High-throughput data such as microarrays make it possible to investigate the molecular-level mechanism of cancer more efficiently. Computational methods boost the microarray ana...
Chan-Hoon Park, Soo-Jin Kim, Sun Kim, Dong-Yeon Ch...
The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the unlabeled data is merely weakly related to the target classes, it becomes quest...
Semi-supervised learning methods construct classifiers using both labeled and unlabeled training data samples. While unlabeled data samples can help to improve the accuracy of trai...