Web transaction data between web visitors and web functionalities usually convey users’ task-oriented behavior patterns. Clustering web transactions, thus, may capture such infor...
A precondition of existing ensemble-based distributed data mining techniques is the assumption that contributing data are identically and independently distributed. However, this a...
Yan Xing, Michael G. Madden, Jim Duggan, Gerard Ly...
Efficient discovery of frequent patterns from large databases is an active research area in data mining with broad applications in industry and deep implications in many areas of d...
We develop the notion of normalized information distance (NID) [7] into a kernel distance suitable for use with a Support Vector Machine classifier, and demonstrate its use for an...
This paper proposes a new support vector machine (SVM) with a robust loss function for data mining. Its dual optimal formation is also constructed. A gradient based algor...
Execution cost of batched data mining queries can be reduced by integrating their I/O steps. Due to memory limitations, not all data mining queries in a batch can be executed toget...
Accurate probability-based ranking of instances is crucial in many real-world data mining applications. KNN (k-nearest neighbor) [1] has been intensively studied as an effective c...
In real-world data mining applications, an accurate ranking is as important as an accurate classification. Naive Bayes (NB) has been widely used in data mining as a simple...
Liangxiao Jiang, Harry Zhang, Zhihua Cai, Jiang Su
Artificial Immune Systems are a new class of algorithms inspired by how the immune system recognizes, attacks and remembers intruders. This is a fascinating idea, but to be accep...
In this paper, we design a genetic algorithm and a simulated annealing algorithm, along with their parallel versions, to solve the Closest String problem. Our implementation and experiments sho...