Bi-clustering is a promising conceptual clustering approach. On categorical data, it provides a collection of (possibly overlapping) bi-clusters, i.e., linked clusters of both objects and attribute-value pairs. We propose a generic framework for bi-clustering that makes it possible to compute a bi-partition from collections of local patterns, which capture locally strong associations between objects and properties. To validate this framework, we have studied in detail one of its instances, CDK-Means. It is a K-Means-like clustering on collections of formal concepts, i.e., connected closed sets on both dimensions. It makes it possible to build bi-partitions with user control over the overlap between bi-clusters. We provide an experimental validation on many benchmark datasets and discuss the interestingness of the computed bi-partitions.
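To make the notion of a formal concept concrete, the following minimal sketch enumerates the formal concepts of a tiny categorical context by brute force. It is an illustration only, not the extraction algorithm used for CDK-Means: the function name `formal_concepts`, the toy data, and the exhaustive search over attribute subsets (exponential in the number of attribute-value pairs) are all assumptions made for this example.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate formal concepts of a small context (hypothetical helper).

    `context` maps each object to its set of attribute-value pairs.
    A formal concept is a pair (objects, attributes) in which the objects
    are exactly those sharing all the attributes, and the attributes are
    exactly those shared by all the objects.
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def extent(attrs):
        # Objects that carry every attribute in `attrs`.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # Attributes common to every object in `objs`
        # (all attributes when `objs` is empty).
        common = set(attributes)
        for o in objs:
            common &= context[o]
        return frozenset(common)

    concepts = set()
    ordered = sorted(attributes)
    # Brute force: close every attribute subset. Only viable for toy data.
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            objs = extent(set(subset))
            concepts.add((objs, intent(objs)))
    return concepts


# Toy context (assumed data): three objects, attribute-value pairs as strings.
toy = {
    "o1": {"color=red", "size=small"},
    "o2": {"color=red", "size=small"},
    "o3": {"size=small", "shape=round"},
}

for objs, attrs in sorted(formal_concepts(toy), key=lambda c: -len(c[0])):
    print(sorted(objs), sorted(attrs))
```

Each printed pair links a set of objects to the attribute-value pairs they all share, which is the kind of local pattern that a K-Means-like procedure would then group into bi-clusters.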
Ruggero G. Pensa, Céline Robardet, Jean-Fra