

Parallel Out-of-Core Divide-and-Conquer Techniques with Application to Classification Trees

Classification is an important problem in the field of data mining. Constructing good classifiers is computationally intensive and offers plenty of scope for parallelization. The divide-and-conquer paradigm can be used to construct decision tree classifiers efficiently. We discuss in detail various techniques for parallel divide-and-conquer and extend these techniques to handle disk-resident data efficiently. Furthermore, we suggest a generic technique for parallel out-of-core divide-and-conquer problems. We present pCLOUDS, the parallel version of the decision tree classifier algorithm CLOUDS, capable of handling large out-of-core data sets. pCLOUDS exhibits excellent speedup, sizeup, and scaleup properties, which make it a competitive tool for data mining applications. We evaluate the performance of pCLOUDS for a range of synthetic data sets on the IBM-SP2.
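To illustrate the divide-and-conquer structure the abstract refers to, here is a minimal in-memory sketch of decision tree construction. It is not CLOUDS or pCLOUDS: the Gini-based exhaustive split search, the numeric-only attributes, and all function names (`gini`, `best_split`, `build_tree`, `predict`) are assumptions chosen for illustration. The point it shows is that, once a node's records are partitioned, the two child subproblems are independent, which is the property a parallel or out-of-core scheme could exploit.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (weighted_gini, attr_index, threshold) for the best binary
    split on a numeric attribute, or None if no attribute can be split."""
    best = None
    n = len(rows)
    for attr in range(len(rows[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for threshold in sorted({r[attr] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[attr] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[attr] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best


def build_tree(rows, labels):
    """Divide: partition the records on the best split.
    Conquer: build each child subtree independently."""
    split = None if len(set(labels)) == 1 else best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, attr, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[attr] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[attr] > threshold]
    return {
        "attr": attr,
        "threshold": threshold,
        # The two recursive calls share no data; in a parallel setting
        # they could be assigned to different processors.
        "left": build_tree(*map(list, zip(*left))),
        "right": build_tree(*map(list, zip(*right))),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["attr"]] <= tree["threshold"] else tree["right"]
    return tree
```

For example, `build_tree([(1.0, 5.0), (2.0, 4.0), (8.0, 5.0), (9.0, 4.0)], ["a", "a", "b", "b"])` splits once on the first attribute and yields two pure leaves. A disk-resident variant would have to replace the in-memory list comprehensions with passes over data stored on disk, which this sketch does not attempt.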
Mahesh K. Sreenivas, Khaled Alsabti, Sanjay Ranka
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999
Where IPPS