Clustering is an essential data mining task with numerous applications. However, data in most real-life applications are high-dimensional in nature, and the related information of...
Mining data streams is important in both science and commerce. Two major challenges are (1) the data may grow without limit so that it is difficult to retain a long history; and (...
This paper describes a successful but challenging application of data mining in the railway industry. The objective is to optimize maintenance and operation of trains through prog...
Relational graphs are widely used in modeling large scale networks such as biological networks and social networks. In this kind of graph, connectivity becomes critical in identif...
A lift curve, with the true positive rate on the y-axis and the customer pull (or contact) rate on the x-axis, is often used to depict the model performance in many data mining ap...
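As a rough illustration of the lift curve described in this abstract, the sketch below computes the (contact rate, true positive rate) points from model scores and binary outcomes. The function name `lift_curve`, its arguments, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def lift_curve(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (contact_rate, true_positive_rate) points for a lift curve.

    Hypothetical helper: customers are contacted in descending score order,
    and one point is emitted after each additional contact.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank customers from highest to lowest model score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    points = [(0.0, 0.0)]  # no customers contacted, no positives captured
    captured = 0
    for contacted, idx in enumerate(order, start=1):
        captured += labels[idx]
        contact_rate = contacted / len(scores)                # x-axis
        true_positive_rate = captured / total_positives       # y-axis
        points.append((contact_rate, true_positive_rate))
    return points


# Toy data (invented for this example): four customers, two responders.
example_points = lift_curve([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Plotting the second coordinate against the first gives the curve; a model that ranks responders early rises above the diagonal that random contact would produce.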
This paper describes TIPPPS (Time Interleaved Product Purchase Prediction System), which analyses billing data of corporate customers in a large telecommunications company in orde...
This paper considers the problem of modeling disease progression from historical clinical databases, with the ultimate objective of stratifying patients into groups with clearly d...
Ronald K. Pearson, Robert J. Kingan, Alan Hochberg
We propose a new text mining system which extracts characteristic contents from given documents. We define Key semantics as characteristic sub-structures of syntactic dependencie...
We present a probabilistic model-based framework for distributed learning that takes into account privacy restrictions and is applicable to scenarios where the different sites ha...