Finding Consistent Clusters in Data Partitions



Abstract. Given an arbitrary data set for which no particular parametric, statistical, or geometrical structure can be assumed, different clustering algorithms will in general produce different data partitions. In fact, several partitions can also be obtained with a single clustering algorithm, owing to dependencies on initialization or on the value selected for some design parameter. This paper addresses the problem of finding consistent clusters in data partitions, proposing the analysis of the most common associations performed in a majority voting scheme. Clustering results are combined by transforming data partitions into a co-association sample matrix, which maps coherent associations. This matrix is then used to extract the underlying consistent clusters. The proposed methodology is evaluated in the context of k-means clustering, and a new clustering algorithm, voting-k-means, is presented. Examples, using both simulated and real data, show how this majo...
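
The co-association idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `co_association` and `consistent_clusters`, the 0.5 threshold standing in for the majority vote, and the choice to extract clusters by linking every pair that passes the vote (connected components) are all assumptions made for illustration. The input partitions are hand-written label lists rather than the output of actual k-means runs.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster.

    The diagonal is left at 0; only off-diagonal pairs are used below.
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consistent_clusters(partitions, threshold=0.5):
    """Group samples linked by pairs co-assigned in more than `threshold`
    of the partitions (assumed majority-vote rule, single-link merging)."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))  # union-find forest over sample indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three hypothetical partitions of six samples (cluster labels per sample).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
clusters = consistent_clusters(partitions)
print(clusters)
```

In this toy input, samples 4 and 5 are grouped together in every partition, while sample 3 joins the first group only because it is paired with sample 2 in two of the three partitions. Because label values are compared only within a partition, the sketch does not require cluster labels to correspond across partitions.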

Ana L. N. Fred
