On constructing an optimal consensus clustering from multiple clusterings

Computing a suitable measure of consensus among several clusterings on the same data is an important problem that arises in several areas such as computational biology and data mining. In this paper, we formalize a set-theoretic model for computing such a similarity measure. Roughly speaking, in this model we have k > 1 partitions (clusters) of the same data set, each containing the same number of sets, and the goal is to align the sets in each partition to minimize a similarity measure. For k = 2, a polynomial-time solution was proposed by Gusfield (Information Processing Letters, 82, pp. 159-164, 2002). In this paper, we show that the problem is MAX-SNP-hard for k = 3 even if each partition in each cluster contains no more than 2 elements, and we provide a (2 − 2/k)-approximation algorithm for the problem for any k.
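The alignment task for k = 2 can be illustrated with a small sketch. The abstract does not spell out the measure, so the objective below is an assumption: the cost of an alignment is taken to be the number of elements that fall outside the intersections of the matched sets. The function name `overlap_alignment` is hypothetical, and the search is exhaustive over all permutations rather than the polynomial-time method cited above, so it is only practical for a handful of sets.

```python
from itertools import permutations


def overlap_alignment(part_a, part_b):
    """Align the sets of two partitions by exhaustive search.

    Assumed objective (not taken from the paper): minimize the number of
    elements lying outside the intersections of the matched sets.
    Returns (best_perm, cost), where best_perm[i] is the index of the set
    in part_b matched to part_a[i].
    """
    if len(part_a) != len(part_b):
        raise ValueError("both partitions must contain the same number of sets")
    n = sum(len(s) for s in part_a)  # size of the underlying data set
    best_perm, best_overlap = None, -1
    # Try every one-to-one matching of sets; factorial time, illustration only.
    for perm in permutations(range(len(part_b))):
        overlap = sum(len(part_a[i] & part_b[j]) for i, j in enumerate(perm))
        if overlap > best_overlap:
            best_perm, best_overlap = perm, overlap
    return best_perm, n - best_overlap


# Two partitions of {1, ..., 6}, each with three sets.
a = [{1, 2, 3}, {4, 5}, {6}]
b = [{4, 5, 6}, {1, 2}, {3}]
perm, cost = overlap_alignment(a, b)
```

Here the best alignment matches {1, 2, 3} with {1, 2} and {4, 5} with {4, 5, 6}, leaving elements 3 and 6 outside the matched intersections. Replacing the permutation loop with an assignment-problem solver would give a polynomial-time version under the same assumed objective.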
Piotr Berman, Bhaskar DasGupta, Ming-Yang Kao, Jie Wang
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2007
Where IPL
Authors Piotr Berman, Bhaskar DasGupta, Ming-Yang Kao, Jie Wang