We present a simple statistical model of molecular function evolution to predict protein function. The model description encodes general knowledge of how molecular function evolve...
Barbara E. Engelhardt, Michael I. Jordan, Steven E...
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
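As a rough illustration of the DCM distribution named in the entry above, the sketch below evaluates the standard DCM (multivariate Polya) probability mass function and compares it with a plain multinomial that has the same mean word proportions. The function names, the three-word vocabulary, and the parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
from math import exp, lgamma, log


def dcm_log_pmf(counts, alphas):
    """Log-probability of a word-count vector under a DCM with parameters alphas."""
    n = sum(counts)
    a = sum(alphas)
    # log of the multinomial coefficient n! / prod(x_i!)
    log_p = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # log of Gamma(A) / Gamma(n + A), where A is the sum of the alphas
    log_p += lgamma(a) - lgamma(n + a)
    # log of prod Gamma(x_i + alpha_i) / Gamma(alpha_i)
    log_p += sum(lgamma(x + al) - lgamma(al) for x, al in zip(counts, alphas))
    return log_p


def multinomial_log_pmf(counts, probs):
    """Log-probability of a word-count vector under a plain multinomial."""
    n = sum(counts)
    log_p = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    log_p += sum(x * log(p) for x, p in zip(counts, probs) if x > 0)
    return log_p


# Hypothetical three-word vocabulary; small alphas give a heavy-tailed DCM.
alphas = [0.1, 0.1, 0.1]
probs = [al / sum(alphas) for al in alphas]  # same mean proportions as the DCM

repeated = (4, 0, 0)  # one word repeated four times
p_dcm = exp(dcm_log_pmf(repeated, alphas))
p_mult = exp(multinomial_log_pmf(repeated, probs))
print(f"DCM: {p_dcm:.4f}  multinomial: {p_mult:.4f}")
```

With small parameters, the DCM assigns far more probability to a document that repeats a single word than the multinomial with matching mean does, which is the kind of behavior a burstiness-aware model needs.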
Principal component analysis (PCA) minimizes the sum of squared errors (L2-norm) and is sensitive to the presence of outliers. We propose a rotational invariant L1-norm PCA (R1-PC...
Chris H. Q. Ding, Ding Zhou, Xiaofeng He, Hongyuan...
We address the problem of efficiently learning Naive Bayes classifiers under class-conditional classification noise (CCCN). Naive Bayes classifiers rely on the hypothesis that the ...
Recent decision-theoretic planning algorithms are able to find optimal solutions in large problems, using Factored Markov Decision Processes (FMDPs). However, these algorithms need ...
Thomas Degris, Olivier Sigaud, Pierre-Henri Wuille...
Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datas...
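For readers unfamiliar with the ROC curves mentioned in the entry above, here is a minimal sketch of how such a curve can be traced for a binary classifier by sweeping a threshold over its scores. The function names and the toy scores are hypothetical, and the code assumes labels are 0 or 1 with at least one example of each class; it is not the procedure used in the paper.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points for a score ranking."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
print(curve, area_under_curve(curve))
```

Both axes are rates normalized within a class (true positives over all positives, false positives over all negatives), so the curve itself does not show how many of the predicted positives are correct.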
Locally adaptive classifiers are usually superior to the use of a single global classifier. However, there are two major problems in designing locally adaptive classifiers. First,...
Juan Dai, Shuicheng Yan, Xiaoou Tang, James T. Kwo...
Semi-Supervised Support Vector Machines (S3VMs) are an appealing method for using unlabeled data in classification: their objective function favors decision boundaries which do n...
We study hierarchical classification in the general case when an instance could belong to more than one class node in the underlying taxonomy. Experiments done in previous work sh...
We derive a robust Euclidean embedding procedure based on semidefinite programming that may be used in place of the popular classical multidimensional scaling (cMDS) algorithm. We...