For discrete co-occurrence data like documents and words, calculating optimal projections and clustering are two different but related tasks. The goal of projection is to find a ...
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Krieg...
Several new miners for frequent subgraphs have been published recently. Whereas new approaches are presented in detail, the quantitative evaluations are often of limited ...
Problem solving with experiences that are recorded in text form requires a mapping from text to structured cases, so that case comparison can provide informed feedback fo...
Nirmalie Wiratunga, Robert Lothian, Sutanu Chakrab...
In applications such as fraud and intrusion detection, it is of great interest to measure the evolving trends in the data. We consider the problem of quantifying changes between tw...
We examine the problem of monitoring and identifying correlated burst patterns in multi-stream time series databases. Our methodology comprises two steps: a burst dete...
Cost-sensitive decision trees and cost-sensitive naïve Bayes are two recently proposed cost-sensitive learning models designed to minimize the total cost of test and misclassifications...
In many classification and data-mining applications the user does not know a priori which distance measure is the most appropriate for the task at hand without examining the produ...
We describe work aimed at cost-constrained knowledge discovery in the biomedical domain. To improve the diagnostic/prognostic models of cancer, new biomarkers are studied...
Locally Linear Embedding (LLE) has recently been proposed as a method for dimensional reduction of high-dimensional nonlinear data sets. In LLE each data point is reconstructed fro...
Claudio Varini, Andreas Degenhard, Tim W. Nattkemp...
This paper presents an unsupervised discretization method that performs density estimation for univariate data. The subintervals that the discretization produces can be used as the...