Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large dataset. In this paper we present a scalable sampling imple...
This paper presents Weka4WS, a framework that extends the Weka toolkit for supporting distributed data mining on Grid environments. Weka4WS adopts the emerging Web Services Resourc...
Large graph analysis has become increasingly important and is widely used in many applications such as web mining, social network analysis, biology, and information retrieval. The...
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
The recent investigation of privacy-preserving data mining has been motivated by the growing concern about the privacy of individuals when their data is stored, aggregated, and mi...
Zhiqiang Yang, Rebecca N. Wright, Hiranmayee Subra...