Sciweavers

PKDD
2005
Springer

A Random Method for Quantifying Changing Distributions in Data Streams

In applications such as fraud and intrusion detection, it is of great interest to measure evolving trends in the data. We consider the problem of quantifying changes between two datasets with class labels. Traditionally, changes are measured by first estimating the probability distributions of the given data and then computing a distance between the estimated distributions, for instance the Kullback-Leibler (K-L) divergence. However, this approach is computationally infeasible for large, high-dimensional datasets. The problem becomes more challenging in the streaming data environment, where the high speed of the stream makes it difficult for the learning process to keep up with concept drift in the data. To tackle this problem, we propose a method that quantifies concept drift using a universal model that incurs minimal learning cost. The model can also perform classification.
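
The traditional baseline mentioned in the abstract (estimate each dataset's distribution, then compute the K-L divergence) can be sketched as follows. This is a minimal illustration of that baseline only, not the method proposed in the paper. The toy windows, the function names `estimate_distribution` and `kl_divergence`, and the additive smoothing parameter `alpha` are all assumptions made for the example.

```python
import math
from collections import Counter


def estimate_distribution(samples, support, alpha=1.0):
    """Estimate a discrete distribution over `support` from `samples`.

    Additive (Laplace) smoothing with `alpha` > 0 keeps every probability
    positive, so the K-L divergence below is always finite.
    """
    counts = Counter(samples)
    total = len(samples) + alpha * len(support)
    return {x: (counts[x] + alpha) / total for x in support}


def kl_divergence(p, q):
    """K-L divergence D(p || q) in nats for distributions on the same support."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)


# Hypothetical toy data: (feature value, class label) pairs from two windows.
old_window = [("low", "ok")] * 80 + [("high", "ok")] * 15 + [("high", "fraud")] * 5
new_window = [("low", "ok")] * 60 + [("high", "ok")] * 15 + [("high", "fraud")] * 25

# The joint support over (feature, label) pairs seen in either window.
support = sorted(set(old_window) | set(new_window))
p_old = estimate_distribution(old_window, support)
p_new = estimate_distribution(new_window, support)

drift = kl_divergence(p_new, p_old)      # change from old to new window
no_drift = kl_divergence(p_old, p_old)   # identical distributions give zero

print(f"D(new || old) = {drift:.4f} nats")
print(f"D(old || old) = {no_drift:.4f} nats")
```

With a single categorical feature the joint support has only a few cells. With many features the number of cells grows exponentially in the number of dimensions, which is the cost the abstract refers to.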
Haixun Wang, Jian Pei
Type Conference
Year 2005
Where PKDD
Authors Haixun Wang, Jian Pei