We perform a statistical analysis and describe the asymptotic behavior of the frequency and size distribution of δ-occurrent, minimal δ-occurrent, and maximal δ-occurrent item...
K-Nearest Neighbor is widely used in text classification, but it has one deficiency: poor computational efficiency. In this paper, we propose a heuristic search method to find the k ...
Chuanyao Yang, Yuqin Li, Chenghong Zhang, Yunfa Hu
The classifier built from a data set with a highly skewed class distribution generally predicts the more frequently occurring classes much more often than the infrequently occurr...
One way to handle data mining problems where class prior probabilities and/or misclassification costs between classes are highly unequal is to resample the data until a new, d...
User profiles derived from Web navigation data are used in important e-commerce applications such as Web personalization, recommender systems, and Web analytics. In the open en...
A new algorithm for minimal infrequent itemset mining is presented. Potential applications of finding infrequent itemsets include statistical disclosure risk assessment, bioinf...
The information available on the World Wide Web is so vast that it can distract users who are trying to find useful information. In order to overcome the large amount...
New privacy regulations, together with ever-increasing data availability and computational power, have created a huge interest in data privacy research. One major researc...
Alina Campan, Traian Marius Truta, John Miller, Ra...
Several algorithms have been introduced for mining frequent itemsets. The recent dataset-transformation approach suffers either from a possible increase in the number of struc...
This paper investigates a technique for predicting ensemble uncertainty originally proposed in the weather forecasting domain. The overall purpose is to find out if the technique...