In this paper we explore a progressive sampling algorithm, called Sampling Error Estimation (SEE), which aims to identify an appropriate sample size for mining association rules. S...
The usefulness of the results produced by data mining methods can be critically impaired by several factors such as (1) low quality of data, including errors due to contamination, ...
Fang Chu, Yizhou Wang, Carlo Zaniolo, Douglas Stot...
The data stream model of computation is often used for analyzing huge volumes of continuously arriving data. In this paper, we present a novel algorithm called DUCstream ...
This paper proposes a novel anomaly detection system for spacecraft based on data mining techniques. It constructs a nonlinear probabilistic model with respect to the behavior of a spacecraft ...
With the rapid advance of the Internet, a large amount of sensitive data is collected, stored, and processed by different parties. Data mining is a powerful tool that can extract ...
This paper describes a theoretical approach to data mining and information classification, and gives a global overview of our OntoExtractor application, concerning the analysis of incoming data...
Zhan Cui, Ernesto Damiani, Marcello Leida, Marco V...
Clustering is an essential data mining task with numerous applications. However, data in most real-life applications are high-dimensional in nature, and the related information of...
Mining data streams is important in both science and commerce. Two major challenges are (1) the data may grow without limit so that it is difficult to retain a long history; and (...
This paper describes a successful but challenging application of data mining in the railway industry. The objective is to optimize maintenance and operation of trains through prog...