Abstract. This paper establishes the foundation for the performance measurements of privacy preserving data mining techniques. The performance is measured in terms of the accuracy ...
We propose adaptive nonlinear auto-associative modeling (ANAM) based on Locally Linear Embedding algorithm (LLE) for learning intrinsic principal features of each concept separatel...
Density-based clustering has the advantages for (i) allowing arbitrary shape of cluster and (ii) not requiring the number of clusters as input. However, when clusters touch each o...
Abstract. In this paper we present a novel and general framework based on concepts of relational algebra for kernel-based learning over relational schema. We exploit the notion of ...
Adam Woznica, Alexandros Kalousis, Melanie Hilario
Traditional multiple query optimization methods focus on identifying common subexpressions in sets of relational queries and on constructing their global execution plans. In this p...
: While feature selection is very difficult for high dimensional, unstructured data such as face image, it may be much easier to do if the data can be faithfully transformed into l...
Abstract. We apply a machine learning method to the occupation coding, which is a task to categorize the answers to open-ended questions regarding the respondent’s occupation. Sp...
Abstract. Bayesian spam filters, in general, compute probability estimations for tokens either without considering the email areas of occurrences except the body or treating the s...
In several contexts and domains, hierarchical agglomerative clustering (HAC) offers best-quality results, but at the price of a high complexity which reduces the size of datasets ...