Data Integration via Constrained Clustering: An Application to Enzyme Clustering

When multiple data sources are available for clustering, an a priori data integration step is usually required. This step can be costly and may not yield good clusterings, since important information is likely to be discarded. In this paper we propose constrained clustering as a strategy for integrating data sources without losing information: the complementary data sources are added as constraints that the clustering algorithm must satisfy. As a concrete application, we focus on enzyme function prediction, a hard task usually carried out through intensive experimental work. We integrate information from diverse sources as constraints and analyze how this additional information affects clustering quality in an enzyme clustering scenario. Our results show that constraints generally improve clustering quality compared with an unconstrained clustering algorithm.
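
The abstract does not say which constrained clustering algorithm or which kinds of constraints the paper uses. As a rough illustration of the general idea only, the sketch below assumes a COP-KMeans-style procedure: one data source supplies the feature vectors, and a second source is reduced to pairwise must-link and cannot-link constraints that every assignment has to respect. The function names (`cop_kmeans`, `violates`, `sq_dist`) and the toy points are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def violates(i, cluster, labels, must_link, cannot_link):
    """Return True if putting point i in `cluster` breaks a constraint,
    given the labels assigned so far (None means not yet assigned)."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] not in (None, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] == cluster:
            return True
    return False


def cop_kmeans(points, k, must_link=(), cannot_link=(), iters=20, seed=0):
    """Assumed COP-KMeans-style sketch: k-means in which each point goes to
    the nearest centroid that does not violate a pairwise constraint."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [None] * len(points)
    for _ in range(iters):
        new_labels = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest centroid.
            order = sorted(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for c in order:
                if not violates(i, c, new_labels, must_link, cannot_link):
                    new_labels[i] = c
                    break
            else:
                raise ValueError("no feasible cluster for point %d" % i)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [points[i] for i, l in enumerate(labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


# Toy data: the feature vectors stand in for the primary data source.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
unconstrained = cop_kmeans(points, 2)
# A second, hypothetical source says items 0 and 3 belong together.
constrained = cop_kmeans(points, 2, must_link=[(0, 3)])
```

In this sketch the second source never has to be merged into the feature vectors; it only restricts which assignments are allowed. A greedy assignment like this one can fail when constraints conflict, which is why the function raises an error instead of silently dropping a constraint.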

Elisa Boari de Lima, Raquel Cardoso de Melo Minardi, Wagner Meira Jr., Mohammed Javeed Zaki

Clustering Algorithm | Complementary Data | Data Mining | Information Impacts | SDM 2011

» Object-Swapping for Resource-Constrained Devices

» Integrating meta-path selection with user-guided object clustering in heterogeneous informat...

» Integrated Task and Data Parallel Support for Dynamic Applications

» Merging Interface Schemas on the Deep Web via Clustering Aggregation

» Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

» Clustering Complex Data with Group-Dependent Feature Selection

» Efficient Maximum Margin Clustering via Cutting Plane Algorithm

» VANTED: A system for advanced data analysis and visualization in the context of biological ...

Post Info

Added	17 Sep 2011
Updated	17 Sep 2011
Type	Journal
Year	2011
Where	SDM
Authors	Elisa Boari de Lima, Raquel Cardoso de Melo Minardi, Wagner Meira Jr., Mohammed Javeed Zaki
