Frequent itemsets mining is well explored for various data types, and its computational complexity is well understood. Based on our previous work by Nguyen and Orlowska (2005), this paper shows the extension of the data pre-processing approach to further improve the performance of frequent itemsets computation. The methods focus on potential reduction of the size of the input data required for deployment of the partitioning based algorithms. We have made a series of the data pre-processing methods such that the final step of the Partition algorithm, where a combination of all local candidate sets must be processed, is executed on substantially smaller input data. Moreover, we have made a comparison among these methods based on the experiments with particular data sets.
Son N. Nguyen, Maria E. Orlowska