Several algorithms for clustering large or streaming data sets have recently been proposed. Most of them address crisp clustering and cannot be easily generalized to the fuzzy case. In this paper, we propose a simple single-pass (through the data) fuzzy c-means algorithm that uses neither complicated data structures nor complicated data compression techniques, yet produces data partitions comparable to those of fuzzy c-means. We also show that, compared to fuzzy c-means, our single-pass algorithm yields excellent speed-ups in clustering and can therefore be used even when the data can be fully loaded into memory. Experimental results on five real data sets are provided.
Prodip Hore, Lawrence O. Hall, Dmitry B. Goldgof