Knowledge discovery systems are constrained by three main limited resources: time, memory and sample size. Sample size is traditionally the dominant limitation, but in many present...
Abstract. One important challenge in data mining is to extract interesting knowledge and useful information for expert users. Since data mining algorithms extract a huge quantity ...
We propose a novel method, called heterogeneous clustering ensemble (HCE), to generate robust clustering results that combine multiple partitions (clusters) derived from various cl...
Hye-Sung Yoon, Sang-Ho Lee, Sung-Bum Cho, Ju Han K...
Mars probes send an enormous amount of data back to Earth. Automating the analysis of this data and its interpretation represents a challenging test of significant benefit to the doma...
Tomasz F. Stepinski, Soumya Ghosh, Ricardo Vilalta
Abstract. We investigate a generative latent variable model for model-based word saliency estimation for text modelling and classification. The estimation algorithm derived is able ...
Abstract. To the best of our knowledge, this paper is the first attempt to formalise a pragmatic logic of scientific discovery in a manner such that it can be realised by scientist...
Jean Sallantin, Christopher Dartnell, Mohammad Afs...
Paleoclimatology requires the analysis of paleo time-series, obtained from a number of independent techniques and instruments, produced by several researchers and/or laboratories. ...
Incremental learning is an approach to deal with the classification task when datasets are too large or when new examples can arrive at any time. One possible approach uses concent...
Abstract. Clustering algorithms based on a matrix of pairwise similarities (kernel matrix) for the data are widely known and used, a particularly popular class being spectral clust...
In this paper, we propose a method for discovering hidden information from large-scale item set data based on the symmetry of items. Symmetry is a fundamental concept in the theory...