Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have re...
The computation of covariance and correlation matrices are critical to many data mining applications and processes. Unfortunately the classical covariance and correlation matrices...
James Chilson, Raymond T. Ng, Alan Wagner, Ruben H...
Finding informative genes from microarray data is an important research problem in bioinformatics research and applications. Most of the existing methods rank features according t...
Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and mor...
Surveillance systems have long been used to monitor industrial processes and are becoming increasingly popular in public health and anti-terrorism applications. Most early detecti...
Numerous applications of data mining to scientific data involve the induction of a classification model. In many cases, the collection of data is not performed with this task in m...
Essentially all data mining algorithms assume that the datagenerating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
This paper describes a prototype that predicts the shopping lists for customers in a retail store. The shopping list prediction is one aspect of a larger system we have developed ...
Chad M. Cumby, Andrew E. Fano, Rayid Ghani, Marko ...
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...