Efficient and effective analysis of large datasets from microarray gene expression data is one of the keys to time-critical personalized medicine. The issue we address here is the ...
This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...
Efficient discovery of frequent patterns from large databases is an active research area in data mining with broad applications in industry and deep implications in many areas of d...
More and more of our customers have to deal with very large datasets like elevation data and digital roadmaps covering Europe or even the entire world, very large images e.g. from...
Decision tree induction algorithms scale well to large datasets because of their univariate, divide-and-conquer approach. However, they may fail to discover effective knowledge when...
Giovanni Giuffrida, Wesley W. Chu, Dominique M. Ha...
Simulations and experiments in the fusion and plasma physics community generate large datasets at remote sites. Visualization and analysis of these datasets are difficult because ...
Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters...
This paper explores unexpected results that lie at the intersection of two common themes in the KDD community: large datasets and the goal of building compact models. Experiments ...
We show how frequently occurring sequential patterns may be found from large datasets by first inducing a finite state automaton model describing the data, and then querying the m...
Timely and cost-effective processing of large datasets has become a critical ingredient for the success of many academic, government, and industrial organizations. The combination...