We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman’s coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over others, and demonstrate our approach in document clustering and phylolinguistics.
Yee Whye Teh, Hal Daumé III, Daniel M. Roy