Sciweavers

RECOMB
2002
Springer

Probabilistic hierarchical clustering for biological data

Biological data, such as gene expression profiles or protein sequences, is often organized in a hierarchy of classes, where the instances assigned to "nearby" classes in the tree are similar. Most approaches for constructing a hierarchy use simple local operations that are very sensitive to noise or variation in the data. In this paper, we describe probabilistic abstraction hierarchies (PAH) [11], a general probabilistic framework for clustering data into a hierarchy, and show how it can be applied to a wide variety of biological data sets. In a PAH, each class is associated with a probabilistic generative model for the data in the class. The PAH clustering algorithm simultaneously optimizes three things: the assignment of data instances to clusters, the models associated with the clusters, and the structure of the abstraction hierarchy. A unique feature of the PAH approach is that it utilizes global optimization algorithms for the last two steps, substantially reducing the...
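
As a rough illustration of the first two quantities named in the abstract (instance assignments and per-class generative models), the sketch below alternates between assigning each instance to the class whose model gives it the highest likelihood and refitting each class model to its members. It is not the PAH algorithm: it assumes diagonal Gaussian class models, uses hard assignments, and leaves out the abstraction hierarchy and its structure search entirely. All function names, the farthest-first initialization, and the toy data are hypothetical choices made for this example.

```python
import math

def log_likelihood(x, mean, var):
    """Log density of x under a Gaussian with independent dimensions."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def fit_model(points, min_var=1e-3):
    """Maximum-likelihood mean and per-dimension variance for one class."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # Floor the variance so a tight class does not produce a degenerate model.
    var = [
        max(sum((p[d] - mean[d]) ** 2 for p in points) / n, min_var)
        for d in range(dim)
    ]
    return mean, var

def farthest_first(data, k):
    """Deterministic initialization (an assumption of this sketch)."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(
            data,
            key=lambda p: min(
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers
            ),
        ))
    return centers

def cluster(data, k, max_iters=50):
    """Alternate hard assignment and model refitting until nothing changes."""
    models = [(list(c), [1.0] * len(c)) for c in farthest_first(data, k)]
    assign = None
    for _ in range(max_iters):
        # Assignment step: most likely class for each instance.
        new_assign = [
            max(range(k), key=lambda c: log_likelihood(x, *models[c]))
            for x in data
        ]
        if new_assign == assign:
            break
        assign = new_assign
        # Model step: refit each class to its current members.
        for c in range(k):
            members = [x for x, a in zip(data, assign) if a == c]
            if members:  # an empty class keeps its previous model
                models[c] = fit_model(members)
    return assign, models

# Toy data: two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
assign, models = cluster(data, k=2)
```

On the toy data, the first three points should end up in one class and the last three in the other. A fuller treatment in the spirit of the abstract would also tie the class models together through a tree and search over that tree, which this sketch does not attempt.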
Eran Segal, Daphne Koller
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2002
Where RECOMB
Authors Eran Segal, Daphne Koller