Sequential pattern mining has been an emerging problem in data mining. In this paper, we propose a new algorithm for mining frequent sequences. It processes only one scan of the da...
This paper presents Weka4WS, a framework that extends the Weka toolkit for supporting distributed data mining on Grid environments. Weka4WS adopts the emerging Web Services Resourc...
Logistic Model Trees have been shown to be very accurate and compact classifiers [8]. Their greatest disadvantage is the computational complexity of inducing the logistic regressi...
Bi-clustering is a promising conceptual clustering approach. Within categorical data, it provides a collection of (possibly overlapping) bi-clusters, i.e., linked clusters for both...
In most of the learning algorithms, examples in the training set are treated equally. Some examples, however, carry more reliable or critical information about the target than the ...
Ling Li, Amrit Pratap, Hsuan-Tien Lin, Yaser S. Ab...
Most real-world datasets are, to a certain degree, skewed. When they are also large, they become the pinnacle challenge in data analysis. More importantly...
For an undirected graph G without self-loops, we prove: (i) that the number of closed patterns in the adjacency matrix of G is even; (ii) that the number of the closed patterns i...
How can we generate realistic graphs? In addition, how can we do so with a mathematically tractable model that makes it feasible to analyze their properties rigorously? Real graphs...
Jure Leskovec, Deepayan Chakrabarti, Jon M. Kleinb...
Imbalanced data learning has recently begun to receive much attention from research and industrial communities as traditional machine learners no longer give satisfactory results. ...
We consider the problem of elastic matching of time series. We propose an algorithm that automatically determines a subsequence b' of a target time series b that best matches a query ...
Longin Jan Latecki, Vasilis Megalooikonomou, Qiang...