The detection of correlations between different features in a set of feature vectors is a very important data mining task because correlation indicates a dependency between the fe...
Block-level sampling is far more efficient than true uniform-random sampling over a large database, but prone to significant errors if used to create database statistics. In this ...
We make two main contributions in this paper. First, we motivate and introduce a novel class of data mining problems that arise in labeling a group of mass spectra, specifically f...
We present several methods for mining knowledge from the query logs of the MSN search engine. Using the query logs, we build a time series for each query word or phrase (e.g., `Th...
Michail Vlachos, Christopher Meek, Zografoula Vage...
High-quality annotation of biological data is central to bioinformatics. Annotation using terms from ontologies provides reliable computational access to data. The Gene Ontology (...
Michael Bada, Daniele Turi, Robin McEntire, Robert...