The recent years have witnessed a surge of interests of semi-supervised clustering methods, which aim to cluster the data set under the guidance of some supervisory information. U...
Stroke is the third leading cause of death and the principal cause of serious long-term disability in the United States. Accurate prediction of stroke is highly valuable for early...
Abstract In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user...
Biclustering refers to simultaneously capturing correlations present among subsets of attributes (columns) and records (rows). It is widely used in data mining applications includ...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...