In real-world machine learning problems, it is very common that part of the input feature vector is incomplete: either not available, missing, or corrupted. In this paper, we pres...
Chinese characters that are similar in their pronunciations or in their internal structures are useful for computer-assisted language learning and for psycholinguistic studies. Al...
Measuring image similarity is a central topic in computer vision. In this paper, we learn similarity from Flickr groups and use it to organize photos. Two images are similar if th...
Random forests are one of the most successful ensemble methods which exhibits performance on the level of boosting and support vector machines. The method is fast, robust to noise,...
Multiple data sources containing different types of features may be available for a given task. For instance, users’ profiles can be used to build recommendation systems. In a...