In this paper, we examine frequent pattern mining for uncertain data sets. We show how the broad classes of algorithms can be extended to the uncertain data setting. In particular, we discuss candidate generate-and-test algorithms, hyper-structure algorithms, and pattern-growth-based algorithms. One of our key observations is that the experimental behavior of the different classes of algorithms in the uncertain case is very different from that in the deterministic case. In particular, the hyper-structure and candidate generate-and-test algorithms perform much better than the tree-based algorithms. This behavior, which is counter-intuitive relative to the deterministic case, is an important observation from the perspective of frequent pattern mining algorithm design for uncertain data. We test the approach on a number of real and synthetic data sets, and show the effectiveness of two of our approaches over competitive techn...
Charu C. Aggarwal, Yan Li, Jianyong Wang, Jing Wan